Network is under initialization...
Network successfully initialized.
INFO: Downloading File to /root/OFAKD-DARTS1/...

Succeed: Total num: 152, size: 370,888,069. OK num: 152(download 152 objects).

average speed 197806000(byte/s)

1.879037(s) elapsed
INFO: Downloading succeed.
INFO: Try to dump ENV REQUIREMENTS_TEXT to /tmp/requirements.txt.
INFO: Try to install requirements from /tmp/requirements.txt.
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Collecting timm>=0.9
  Downloading https://mirrors.aliyun.com/pypi/packages/68/99/2018622d268f6017ddfa5ee71f070bad5d07590374793166baa102849d17/timm-0.9.16-py3-none-any.whl (2.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.2/2.2 MB 5.1 MB/s eta 0:00:00
Requirement already satisfied: safetensors in /usr/local/lib/python3.8/dist-packages (from timm>=0.9->-r /tmp/requirements.txt (line 1)) (0.3.0)
Requirement already satisfied: torch in /usr/local/lib/python3.8/dist-packages (from timm>=0.9->-r /tmp/requirements.txt (line 1)) (2.0.0+cu117)
Requirement already satisfied: huggingface_hub in /usr/local/lib/python3.8/dist-packages (from timm>=0.9->-r /tmp/requirements.txt (line 1)) (0.15.1)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.8/dist-packages (from timm>=0.9->-r /tmp/requirements.txt (line 1)) (5.4.1)
Requirement already satisfied: torchvision in /usr/local/lib/python3.8/dist-packages (from timm>=0.9->-r /tmp/requirements.txt (line 1)) (0.15.1+cu117)
Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.8/dist-packages (from huggingface_hub->timm>=0.9->-r /tmp/requirements.txt (line 1)) (23.0)
Requirement already satisfied: requests in /usr/local/lib/python3.8/dist-packages (from huggingface_hub->timm>=0.9->-r /tmp/requirements.txt (line 1)) (2.28.2)
Requirement already satisfied: fsspec in /usr/local/lib/python3.8/dist-packages (from huggingface_hub->timm>=0.9->-r /tmp/requirements.txt (line 1)) (2023.3.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.8/dist-packages (from huggingface_hub->timm>=0.9->-r /tmp/requirements.txt (line 1)) (4.5.0)
Requirement already satisfied: tqdm>=4.42.1 in /usr/local/lib/python3.8/dist-packages (from huggingface_hub->timm>=0.9->-r /tmp/requirements.txt (line 1)) (4.65.0)
Requirement already satisfied: filelock in /usr/local/lib/python3.8/dist-packages (from huggingface_hub->timm>=0.9->-r /tmp/requirements.txt (line 1)) (3.9.1)
Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.8/dist-packages (from torch->timm>=0.9->-r /tmp/requirements.txt (line 1)) (2.0.0)
Requirement already satisfied: networkx in /usr/local/lib/python3.8/dist-packages (from torch->timm>=0.9->-r /tmp/requirements.txt (line 1)) (3.1)
Requirement already satisfied: sympy in /usr/local/lib/python3.8/dist-packages (from torch->timm>=0.9->-r /tmp/requirements.txt (line 1)) (1.11.1)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.8/dist-packages (from torch->timm>=0.9->-r /tmp/requirements.txt (line 1)) (3.1.2)
Requirement already satisfied: cmake in /usr/local/lib/python3.8/dist-packages (from triton==2.0.0->torch->timm>=0.9->-r /tmp/requirements.txt (line 1)) (3.26.3)
Requirement already satisfied: lit in /usr/local/lib/python3.8/dist-packages (from triton==2.0.0->torch->timm>=0.9->-r /tmp/requirements.txt (line 1)) (16.0.1)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /usr/local/lib/python3.8/dist-packages (from torchvision->timm>=0.9->-r /tmp/requirements.txt (line 1)) (9.4.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.8/dist-packages (from torchvision->timm>=0.9->-r /tmp/requirements.txt (line 1)) (1.23.5)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.8/dist-packages (from jinja2->torch->timm>=0.9->-r /tmp/requirements.txt (line 1)) (2.1.2)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub->timm>=0.9->-r /tmp/requirements.txt (line 1)) (1.26.15)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub->timm>=0.9->-r /tmp/requirements.txt (line 1)) (3.4)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub->timm>=0.9->-r /tmp/requirements.txt (line 1)) (3.1.0)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub->timm>=0.9->-r /tmp/requirements.txt (line 1)) (2022.12.7)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.8/dist-packages (from sympy->torch->timm>=0.9->-r /tmp/requirements.txt (line 1)) (1.3.0)
Installing collected packages: timm
  Attempting uninstall: timm
    Found existing installation: timm 0.6.12
    Uninstalling timm-0.6.12:
      Successfully uninstalled timm-0.6.12
Successfully installed timm-0.9.16
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

[notice] A new release of pip is available: 23.0.1 -> 24.0
[notice] To update, run: python -m pip install --upgrade pip
Training with a single process on 1 GPUs.
Data processing configuration for current model + dataset:
	input_size: (3, 32, 32)
	interpolation: bilinear
	mean: (0.49139968, 0.48215827, 0.44653124)
	std: (0.24703233, 0.24348505, 0.26158768)
	crop_pct: 1.0
	crop_mode: center

-------------------------------
Learnable parameters
Student: 1.93M
Extra: 0.00M
-------------------------------
Scheduled epochs: 50
p_max: 0.125
search_space = s5
Using downloaded and verified file: /mnt/OFAKD-DARTS1/data/cifar-10-python.tar.gz
Extracting /mnt/OFAKD-DARTS1/data/cifar-10-python.tar.gz to /mnt/OFAKD-DARTS1/data
Train: 0 [   0/390]  Loss: 2.441 (2.44)  Acc@1: 10.9375 (10.9375)  Acc@5: 42.1875 (42.1875)LR: 2.500e-02
Train: 0 [  50/390]  Loss: 1.939 (2.02)  Acc@1: 23.4375 (25.0000)  Acc@5: 82.8125 (79.6569)LR: 2.500e-02
Train: 0 [ 100/390]  Loss: 1.764 (1.89)  Acc@1: 34.3750 (29.6875)  Acc@5: 89.0625 (83.4623)LR: 2.500e-02
Train: 0 [ 150/390]  Loss: 1.551 (1.82)  Acc@1: 40.6250 (32.2227)  Acc@5: 90.6250 (85.1407)LR: 2.500e-02
Train: 0 [ 200/390]  Loss: 1.805 (1.77)  Acc@1: 31.2500 (34.2662)  Acc@5: 84.3750 (86.3029)LR: 2.500e-02
Train: 0 [ 250/390]  Loss: 1.426 (1.72)  Acc@1: 46.8750 (36.3172)  Acc@5: 93.7500 (87.0642)LR: 2.500e-02
Train: 0 [ 300/390]  Loss: 1.409 (1.67)  Acc@1: 45.3125 (38.0035)  Acc@5: 92.1875 (88.0243)LR: 2.500e-02
Train: 0 [ 350/390]  Loss: 1.268 (1.64)  Acc@1: 60.9375 (39.5700)  Acc@5: 95.3125 (88.8088)LR: 2.500e-02
Train: 0 [ 390/390]  Loss: 1.174 (1.60)  Acc@1: 55.0000 (40.7520)  Acc@5: 95.0000 (89.2680)LR: 2.500e-02
train_acc 40.752000
Valid: 0 [   0/390]  Loss: 1.419 (1.42)  Acc@1: 48.4375 (48.4375)  Acc@5: 95.3125 (95.3125)
Valid: 0 [  50/390]  Loss: 1.449 (1.39)  Acc@1: 50.0000 (51.3787)  Acc@5: 90.6250 (92.8922)
Valid: 0 [ 100/390]  Loss: 1.449 (1.38)  Acc@1: 46.8750 (51.4542)  Acc@5: 95.3125 (92.8218)
Valid: 0 [ 150/390]  Loss: 1.368 (1.38)  Acc@1: 45.3125 (51.4901)  Acc@5: 96.8750 (92.9325)
Valid: 0 [ 200/390]  Loss: 1.447 (1.37)  Acc@1: 48.4375 (51.7102)  Acc@5: 92.1875 (93.0504)
Valid: 0 [ 250/390]  Loss: 1.481 (1.38)  Acc@1: 48.4375 (51.4691)  Acc@5: 95.3125 (92.9905)
Valid: 0 [ 300/390]  Loss: 1.509 (1.38)  Acc@1: 50.0000 (51.3133)  Acc@5: 89.0625 (92.9817)
Valid: 0 [ 350/390]  Loss: 1.265 (1.38)  Acc@1: 50.0000 (51.3622)  Acc@5: 95.3125 (93.0556)
Valid: 0 [ 390/390]  Loss: 1.545 (1.38)  Acc@1: 47.5000 (51.3760)  Acc@5: 90.0000 (93.0960)
valid_acc 51.376000
epoch = 0   
 genotype = Genotype(normal=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_5x5', 2), ('dil_conv_5x5', 1), ('dil_conv_3x3', 1), ('dil_conv_3x3', 2), ('dil_conv_5x5', 1), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('max_pool_3x3', 0), ('sep_conv_5x5', 1), ('dil_conv_3x3', 2), ('max_pool_3x3', 0), ('dil_conv_3x3', 3), ('sep_conv_3x3', 0), ('dil_conv_5x5', 2), ('sep_conv_5x5', 1)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1243, 0.1270, 0.1196, 0.1228, 0.1258, 0.1283, 0.1259, 0.1262],
        [0.1292, 0.1232, 0.1184, 0.1209, 0.1274, 0.1272, 0.1266, 0.1270],
        [0.1253, 0.1267, 0.1199, 0.1228, 0.1281, 0.1272, 0.1244, 0.1255],
        [0.1291, 0.1228, 0.1183, 0.1199, 0.1263, 0.1274, 0.1280, 0.1283],
        [0.1292, 0.1220, 0.1182, 0.1219, 0.1263, 0.1286, 0.1271, 0.1268],
        [0.1261, 0.1254, 0.1196, 0.1221, 0.1284, 0.1268, 0.1250, 0.1264],
        [0.1283, 0.1220, 0.1181, 0.1198, 0.1275, 0.1269, 0.1300, 0.1273],
        [0.1293, 0.1205, 0.1179, 0.1208, 0.1271, 0.1274, 0.1287, 0.1284],
        [0.1303, 0.1203, 0.1180, 0.1207, 0.1272, 0.1285, 0.1276, 0.1276],
        [0.1266, 0.1257, 0.1201, 0.1219, 0.1275, 0.1268, 0.1267, 0.1247],
        [0.1305, 0.1218, 0.1186, 0.1198, 0.1270, 0.1272, 0.1265, 0.1287],
        [0.1310, 0.1206, 0.1183, 0.1217, 0.1268, 0.1265, 0.1279, 0.1272],
        [0.1303, 0.1199, 0.1179, 0.1197, 0.1281, 0.1273, 0.1287, 0.1281],
        [0.1307, 0.1203, 0.1183, 0.1197, 0.1274, 0.1285, 0.1277, 0.1273]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1243, 0.1272, 0.1242, 0.1257, 0.1259, 0.1250, 0.1239, 0.1239],
        [0.1256, 0.1252, 0.1225, 0.1237, 0.1261, 0.1264, 0.1256, 0.1248],
        [0.1235, 0.1267, 0.1239, 0.1244, 0.1251, 0.1253, 0.1248, 0.1263],
        [0.1251, 0.1249, 0.1222, 0.1258, 0.1243, 0.1263, 0.1249, 0.1265],
        [0.1256, 0.1245, 0.1206, 0.1241, 0.1248, 0.1258, 0.1284, 0.1262],
        [0.1242, 0.1254, 0.1228, 0.1259, 0.1274, 0.1262, 0.1245, 0.1236],
        [0.1258, 0.1244, 0.1213, 0.1252, 0.1265, 0.1253, 0.1254, 0.1260],
        [0.1283, 0.1238, 0.1197, 0.1239, 0.1264, 0.1273, 0.1241, 0.1265],
        [0.1266, 0.1234, 0.1191, 0.1217, 0.1273, 0.1279, 0.1281, 0.1260],
        [0.1234, 0.1272, 0.1237, 0.1244, 0.1254, 0.1246, 0.1263, 0.1249],
        [0.1250, 0.1254, 0.1227, 0.1238, 0.1259, 0.1273, 0.1240, 0.1260],
        [0.1255, 0.1248, 0.1205, 0.1240, 0.1254, 0.1271, 0.1245, 0.1283],
        [0.1255, 0.1255, 0.1205, 0.1242, 0.1251, 0.1271, 0.1257, 0.1264],
        [0.1257, 0.1249, 0.1203, 0.1235, 0.1253, 0.1272, 0.1265, 0.1268]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 1 [   0/390]  Loss: 1.220 (1.22)  Acc@1: 51.5625 (51.5625)  Acc@5: 96.8750 (96.8750)LR: 2.498e-02
Train: 1 [  50/390]  Loss: 1.382 (1.31)  Acc@1: 48.4375 (51.5931)  Acc@5: 96.8750 (94.3321)LR: 2.498e-02
Train: 1 [ 100/390]  Loss: 1.125 (1.29)  Acc@1: 64.0625 (52.8620)  Acc@5: 96.8750 (94.2760)LR: 2.498e-02
Train: 1 [ 150/390]  Loss: 0.9839 (1.26)  Acc@1: 64.0625 (54.0356)  Acc@5: 95.3125 (94.7434)LR: 2.498e-02
Train: 1 [ 200/390]  Loss: 1.162 (1.23)  Acc@1: 57.8125 (55.2239)  Acc@5: 95.3125 (94.8616)LR: 2.498e-02
Train: 1 [ 250/390]  Loss: 1.225 (1.21)  Acc@1: 57.8125 (55.9885)  Acc@5: 93.7500 (95.1008)LR: 2.498e-02
Train: 1 [ 300/390]  Loss: 1.298 (1.20)  Acc@1: 51.5625 (56.4680)  Acc@5: 93.7500 (95.2502)LR: 2.498e-02
Train: 1 [ 350/390]  Loss: 1.015 (1.19)  Acc@1: 64.0625 (57.0913)  Acc@5: 98.4375 (95.3748)LR: 2.498e-02
Train: 1 [ 390/390]  Loss: 1.184 (1.17)  Acc@1: 52.5000 (57.6080)  Acc@5: 97.5000 (95.4720)LR: 2.498e-02
train_acc 57.608000
Valid: 1 [   0/390]  Loss: 0.9546 (0.955)  Acc@1: 71.8750 (71.8750)  Acc@5: 92.1875 (92.1875)
Valid: 1 [  50/390]  Loss: 0.9499 (1.08)  Acc@1: 65.6250 (62.5306)  Acc@5: 98.4375 (96.0478)
Valid: 1 [ 100/390]  Loss: 1.097 (1.08)  Acc@1: 65.6250 (62.9332)  Acc@5: 92.1875 (95.9623)
Valid: 1 [ 150/390]  Loss: 1.070 (1.06)  Acc@1: 57.8125 (63.3692)  Acc@5: 96.8750 (95.9954)
Valid: 1 [ 200/390]  Loss: 0.8957 (1.06)  Acc@1: 68.7500 (63.2774)  Acc@5: 92.1875 (96.0199)
Valid: 1 [ 250/390]  Loss: 1.164 (1.06)  Acc@1: 54.6875 (63.0727)  Acc@5: 95.3125 (96.0533)
Valid: 1 [ 300/390]  Loss: 1.100 (1.05)  Acc@1: 62.5000 (63.1229)  Acc@5: 95.3125 (96.0237)
Valid: 1 [ 350/390]  Loss: 0.8737 (1.06)  Acc@1: 73.4375 (62.9897)  Acc@5: 96.8750 (96.0515)
Valid: 1 [ 390/390]  Loss: 0.7840 (1.06)  Acc@1: 67.5000 (62.8920)  Acc@5: 100.0000 (96.0480)
valid_acc 62.892000
epoch = 1   
 genotype = Genotype(normal=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_5x5', 2), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('dil_conv_5x5', 2), ('sep_conv_5x5', 4), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('dil_conv_3x3', 2), ('max_pool_3x3', 0), ('sep_conv_5x5', 3), ('sep_conv_5x5', 2), ('dil_conv_5x5', 2), ('max_pool_3x3', 0)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1244, 0.1276, 0.1146, 0.1205, 0.1275, 0.1326, 0.1270, 0.1258],
        [0.1309, 0.1232, 0.1137, 0.1187, 0.1295, 0.1282, 0.1290, 0.1267],
        [0.1253, 0.1278, 0.1153, 0.1207, 0.1314, 0.1292, 0.1241, 0.1262],
        [0.1309, 0.1228, 0.1140, 0.1176, 0.1273, 0.1294, 0.1302, 0.1278],
        [0.1301, 0.1215, 0.1138, 0.1203, 0.1256, 0.1324, 0.1268, 0.1296],
        [0.1275, 0.1263, 0.1149, 0.1197, 0.1330, 0.1280, 0.1247, 0.1257],
        [0.1294, 0.1221, 0.1137, 0.1172, 0.1304, 0.1309, 0.1300, 0.1263],
        [0.1315, 0.1184, 0.1128, 0.1177, 0.1287, 0.1300, 0.1294, 0.1316],
        [0.1340, 0.1180, 0.1132, 0.1181, 0.1292, 0.1315, 0.1284, 0.1275],
        [0.1276, 0.1271, 0.1161, 0.1197, 0.1296, 0.1288, 0.1261, 0.1250],
        [0.1340, 0.1210, 0.1144, 0.1170, 0.1288, 0.1295, 0.1256, 0.1295],
        [0.1342, 0.1192, 0.1138, 0.1191, 0.1292, 0.1288, 0.1286, 0.1270],
        [0.1342, 0.1173, 0.1131, 0.1164, 0.1289, 0.1291, 0.1316, 0.1294],
        [0.1342, 0.1175, 0.1137, 0.1161, 0.1278, 0.1318, 0.1307, 0.1283]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1230, 0.1280, 0.1226, 0.1249, 0.1290, 0.1256, 0.1235, 0.1233],
        [0.1274, 0.1266, 0.1211, 0.1226, 0.1268, 0.1272, 0.1240, 0.1243],
        [0.1236, 0.1283, 0.1231, 0.1229, 0.1250, 0.1256, 0.1250, 0.1264],
        [0.1254, 0.1260, 0.1207, 0.1256, 0.1232, 0.1260, 0.1250, 0.1280],
        [0.1256, 0.1239, 0.1158, 0.1225, 0.1266, 0.1266, 0.1298, 0.1293],
        [0.1221, 0.1275, 0.1222, 0.1261, 0.1274, 0.1263, 0.1247, 0.1238],
        [0.1264, 0.1249, 0.1191, 0.1254, 0.1258, 0.1258, 0.1250, 0.1275],
        [0.1290, 0.1244, 0.1155, 0.1244, 0.1279, 0.1285, 0.1221, 0.1281],
        [0.1275, 0.1226, 0.1161, 0.1219, 0.1282, 0.1299, 0.1287, 0.1252],
        [0.1214, 0.1296, 0.1230, 0.1257, 0.1259, 0.1236, 0.1259, 0.1250],
        [0.1265, 0.1258, 0.1203, 0.1239, 0.1259, 0.1286, 0.1226, 0.1264],
        [0.1258, 0.1250, 0.1163, 0.1233, 0.1260, 0.1291, 0.1237, 0.1308],
        [0.1258, 0.1251, 0.1173, 0.1239, 0.1262, 0.1293, 0.1261, 0.1263],
        [0.1257, 0.1231, 0.1163, 0.1218, 0.1269, 0.1292, 0.1283, 0.1287]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 2 [   0/390]  Loss: 1.036 (1.04)  Acc@1: 60.9375 (60.9375)  Acc@5: 96.8750 (96.8750)LR: 2.491e-02
Train: 2 [  50/390]  Loss: 1.112 (1.01)  Acc@1: 60.9375 (63.7255)  Acc@5: 95.3125 (96.5074)LR: 2.491e-02
Train: 2 [ 100/390]  Loss: 1.225 (1.00)  Acc@1: 56.2500 (64.4183)  Acc@5: 93.7500 (96.7358)LR: 2.491e-02
Train: 2 [ 150/390]  Loss: 0.9838 (0.996)  Acc@1: 64.0625 (64.7765)  Acc@5: 95.3125 (96.7301)LR: 2.491e-02
Train: 2 [ 200/390]  Loss: 0.9381 (0.993)  Acc@1: 60.9375 (64.7932)  Acc@5: 98.4375 (96.7739)LR: 2.491e-02
Train: 2 [ 250/390]  Loss: 0.8543 (0.992)  Acc@1: 70.3125 (64.7784)  Acc@5: 98.4375 (96.8190)LR: 2.491e-02
Train: 2 [ 300/390]  Loss: 1.091 (0.985)  Acc@1: 59.3750 (65.0073)  Acc@5: 95.3125 (96.9373)LR: 2.491e-02
Train: 2 [ 350/390]  Loss: 0.8550 (0.979)  Acc@1: 76.5625 (65.2377)  Acc@5: 98.4375 (96.9195)LR: 2.491e-02
Train: 2 [ 390/390]  Loss: 0.7417 (0.973)  Acc@1: 75.0000 (65.5760)  Acc@5: 100.0000 (96.9720)LR: 2.491e-02
train_acc 65.576000
Valid: 2 [   0/390]  Loss: 0.8883 (0.888)  Acc@1: 75.0000 (75.0000)  Acc@5: 93.7500 (93.7500)
Valid: 2 [  50/390]  Loss: 0.8829 (0.944)  Acc@1: 68.7500 (67.4020)  Acc@5: 98.4375 (96.6299)
Valid: 2 [ 100/390]  Loss: 0.9489 (0.931)  Acc@1: 68.7500 (67.9146)  Acc@5: 95.3125 (96.5811)
Valid: 2 [ 150/390]  Loss: 0.6117 (0.919)  Acc@1: 85.9375 (68.6258)  Acc@5: 98.4375 (96.4921)
Valid: 2 [ 200/390]  Loss: 1.153 (0.920)  Acc@1: 65.6250 (68.4546)  Acc@5: 93.7500 (96.4397)
Valid: 2 [ 250/390]  Loss: 0.8586 (0.911)  Acc@1: 64.0625 (68.6255)  Acc@5: 98.4375 (96.5575)
Valid: 2 [ 300/390]  Loss: 0.9539 (0.908)  Acc@1: 64.0625 (68.7448)  Acc@5: 98.4375 (96.6051)
Valid: 2 [ 350/390]  Loss: 0.8815 (0.911)  Acc@1: 67.1875 (68.5230)  Acc@5: 96.8750 (96.5990)
Valid: 2 [ 390/390]  Loss: 0.8976 (0.910)  Acc@1: 70.0000 (68.5320)  Acc@5: 100.0000 (96.6320)
valid_acc 68.532000
epoch = 2   
 genotype = Genotype(normal=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_5x5', 2), ('sep_conv_3x3', 0), ('sep_conv_5x5', 3), ('sep_conv_5x5', 4), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('max_pool_3x3', 1), ('dil_conv_3x3', 2), ('max_pool_3x3', 0), ('sep_conv_5x5', 3), ('sep_conv_5x5', 2), ('sep_conv_5x5', 4), ('sep_conv_5x5', 3)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1246, 0.1282, 0.1114, 0.1192, 0.1293, 0.1358, 0.1269, 0.1247],
        [0.1315, 0.1227, 0.1098, 0.1169, 0.1320, 0.1304, 0.1294, 0.1272],
        [0.1257, 0.1289, 0.1123, 0.1196, 0.1331, 0.1297, 0.1234, 0.1273],
        [0.1328, 0.1224, 0.1107, 0.1164, 0.1276, 0.1321, 0.1304, 0.1277],
        [0.1303, 0.1208, 0.1105, 0.1191, 0.1279, 0.1326, 0.1284, 0.1304],
        [0.1282, 0.1268, 0.1116, 0.1181, 0.1352, 0.1307, 0.1245, 0.1249],
        [0.1309, 0.1210, 0.1098, 0.1147, 0.1310, 0.1335, 0.1317, 0.1274],
        [0.1339, 0.1169, 0.1090, 0.1165, 0.1290, 0.1311, 0.1307, 0.1329],
        [0.1378, 0.1156, 0.1092, 0.1163, 0.1297, 0.1341, 0.1283, 0.1289],
        [0.1288, 0.1292, 0.1139, 0.1192, 0.1312, 0.1289, 0.1244, 0.1244],
        [0.1363, 0.1212, 0.1113, 0.1155, 0.1305, 0.1304, 0.1259, 0.1289],
        [0.1352, 0.1186, 0.1105, 0.1175, 0.1310, 0.1307, 0.1300, 0.1265],
        [0.1377, 0.1155, 0.1097, 0.1146, 0.1295, 0.1303, 0.1319, 0.1308],
        [0.1369, 0.1153, 0.1101, 0.1136, 0.1294, 0.1334, 0.1311, 0.1302]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1234, 0.1281, 0.1218, 0.1241, 0.1310, 0.1271, 0.1221, 0.1222],
        [0.1275, 0.1277, 0.1208, 0.1229, 0.1265, 0.1271, 0.1228, 0.1247],
        [0.1242, 0.1278, 0.1217, 0.1229, 0.1246, 0.1267, 0.1262, 0.1258],
        [0.1262, 0.1277, 0.1207, 0.1252, 0.1236, 0.1254, 0.1238, 0.1275],
        [0.1242, 0.1214, 0.1120, 0.1204, 0.1289, 0.1284, 0.1327, 0.1320],
        [0.1226, 0.1292, 0.1228, 0.1239, 0.1268, 0.1260, 0.1254, 0.1232],
        [0.1260, 0.1254, 0.1177, 0.1249, 0.1260, 0.1266, 0.1239, 0.1294],
        [0.1291, 0.1236, 0.1124, 0.1236, 0.1290, 0.1298, 0.1232, 0.1292],
        [0.1278, 0.1228, 0.1150, 0.1226, 0.1273, 0.1314, 0.1288, 0.1243],
        [0.1208, 0.1304, 0.1227, 0.1258, 0.1245, 0.1235, 0.1262, 0.1261],
        [0.1255, 0.1262, 0.1186, 0.1230, 0.1248, 0.1314, 0.1217, 0.1288],
        [0.1281, 0.1231, 0.1124, 0.1232, 0.1275, 0.1309, 0.1232, 0.1317],
        [0.1248, 0.1233, 0.1143, 0.1229, 0.1273, 0.1320, 0.1282, 0.1271],
        [0.1262, 0.1206, 0.1124, 0.1191, 0.1276, 0.1321, 0.1302, 0.1317]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 3 [   0/390]  Loss: 0.9627 (0.963)  Acc@1: 70.3125 (70.3125)  Acc@5: 100.0000 (100.0000)LR: 2.479e-02
Train: 3 [  50/390]  Loss: 0.8194 (0.866)  Acc@1: 73.4375 (69.2402)  Acc@5: 98.4375 (97.7635)LR: 2.479e-02
Train: 3 [ 100/390]  Loss: 0.9446 (0.860)  Acc@1: 59.3750 (69.5545)  Acc@5: 96.8750 (97.7259)LR: 2.479e-02
Train: 3 [ 150/390]  Loss: 0.8118 (0.845)  Acc@1: 65.6250 (69.8469)  Acc@5: 96.8750 (97.7546)LR: 2.479e-02
Train: 3 [ 200/390]  Loss: 1.058 (0.845)  Acc@1: 67.1875 (70.0326)  Acc@5: 95.3125 (97.7146)LR: 2.479e-02
Train: 3 [ 250/390]  Loss: 0.9354 (0.841)  Acc@1: 75.0000 (70.2689)  Acc@5: 93.7500 (97.6905)LR: 2.479e-02
Train: 3 [ 300/390]  Loss: 0.7083 (0.842)  Acc@1: 79.6875 (70.2606)  Acc@5: 96.8750 (97.7367)LR: 2.479e-02
Train: 3 [ 350/390]  Loss: 0.6642 (0.835)  Acc@1: 73.4375 (70.5707)  Acc@5: 100.0000 (97.7564)LR: 2.479e-02
Train: 3 [ 390/390]  Loss: 0.8200 (0.829)  Acc@1: 72.5000 (70.8240)  Acc@5: 97.5000 (97.8280)LR: 2.479e-02
train_acc 70.824000
Valid: 3 [   0/390]  Loss: 0.8985 (0.899)  Acc@1: 67.1875 (67.1875)  Acc@5: 98.4375 (98.4375)
Valid: 3 [  50/390]  Loss: 0.7741 (0.819)  Acc@1: 68.7500 (72.2733)  Acc@5: 98.4375 (97.9167)
Valid: 3 [ 100/390]  Loss: 0.9510 (0.836)  Acc@1: 73.4375 (71.4264)  Acc@5: 93.7500 (97.5866)
Valid: 3 [ 150/390]  Loss: 0.7897 (0.835)  Acc@1: 70.3125 (71.3887)  Acc@5: 98.4375 (97.5993)
Valid: 3 [ 200/390]  Loss: 0.7660 (0.836)  Acc@1: 78.1250 (71.4630)  Acc@5: 98.4375 (97.5513)
Valid: 3 [ 250/390]  Loss: 0.7833 (0.839)  Acc@1: 76.5625 (71.4392)  Acc@5: 98.4375 (97.4477)
Valid: 3 [ 300/390]  Loss: 0.8476 (0.839)  Acc@1: 75.0000 (71.3507)  Acc@5: 95.3125 (97.4149)
Valid: 3 [ 350/390]  Loss: 0.8743 (0.833)  Acc@1: 71.8750 (71.4387)  Acc@5: 95.3125 (97.4403)
Valid: 3 [ 390/390]  Loss: 0.8435 (0.832)  Acc@1: 72.5000 (71.4080)  Acc@5: 97.5000 (97.4280)
valid_acc 71.408000
epoch = 3   
 genotype = Genotype(normal=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 2), ('sep_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('max_pool_3x3', 1), ('dil_conv_3x3', 2), ('max_pool_3x3', 1), ('sep_conv_5x5', 3), ('max_pool_3x3', 0), ('sep_conv_5x5', 3), ('sep_conv_5x5', 4)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1250, 0.1284, 0.1078, 0.1177, 0.1308, 0.1386, 0.1266, 0.1251],
        [0.1326, 0.1215, 0.1059, 0.1147, 0.1365, 0.1316, 0.1301, 0.1270],
        [0.1272, 0.1287, 0.1089, 0.1182, 0.1365, 0.1300, 0.1217, 0.1288],
        [0.1337, 0.1216, 0.1078, 0.1155, 0.1284, 0.1347, 0.1300, 0.1282],
        [0.1311, 0.1187, 0.1067, 0.1179, 0.1304, 0.1331, 0.1306, 0.1316],
        [0.1298, 0.1268, 0.1085, 0.1171, 0.1374, 0.1312, 0.1243, 0.1250],
        [0.1327, 0.1196, 0.1064, 0.1127, 0.1325, 0.1355, 0.1318, 0.1286],
        [0.1360, 0.1151, 0.1055, 0.1154, 0.1309, 0.1335, 0.1297, 0.1339],
        [0.1422, 0.1133, 0.1058, 0.1147, 0.1310, 0.1355, 0.1280, 0.1293],
        [0.1294, 0.1305, 0.1119, 0.1189, 0.1330, 0.1285, 0.1245, 0.1232],
        [0.1398, 0.1204, 0.1085, 0.1138, 0.1320, 0.1320, 0.1258, 0.1277],
        [0.1370, 0.1166, 0.1071, 0.1157, 0.1356, 0.1310, 0.1305, 0.1264],
        [0.1407, 0.1132, 0.1063, 0.1123, 0.1317, 0.1310, 0.1319, 0.1330],
        [0.1411, 0.1125, 0.1067, 0.1109, 0.1296, 0.1352, 0.1316, 0.1324]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1226, 0.1292, 0.1222, 0.1222, 0.1338, 0.1284, 0.1190, 0.1225],
        [0.1280, 0.1284, 0.1200, 0.1233, 0.1274, 0.1257, 0.1229, 0.1243],
        [0.1229, 0.1286, 0.1219, 0.1239, 0.1263, 0.1255, 0.1252, 0.1257],
        [0.1266, 0.1292, 0.1208, 0.1241, 0.1243, 0.1255, 0.1232, 0.1263],
        [0.1236, 0.1204, 0.1099, 0.1202, 0.1312, 0.1290, 0.1331, 0.1326],
        [0.1220, 0.1314, 0.1242, 0.1226, 0.1254, 0.1263, 0.1249, 0.1232],
        [0.1270, 0.1253, 0.1167, 0.1239, 0.1269, 0.1268, 0.1233, 0.1301],
        [0.1294, 0.1225, 0.1101, 0.1241, 0.1311, 0.1300, 0.1228, 0.1300],
        [0.1284, 0.1212, 0.1136, 0.1231, 0.1275, 0.1327, 0.1294, 0.1240],
        [0.1200, 0.1318, 0.1242, 0.1261, 0.1229, 0.1247, 0.1255, 0.1249],
        [0.1251, 0.1262, 0.1176, 0.1222, 0.1244, 0.1327, 0.1226, 0.1291],
        [0.1299, 0.1220, 0.1105, 0.1242, 0.1279, 0.1311, 0.1220, 0.1324],
        [0.1257, 0.1216, 0.1123, 0.1230, 0.1277, 0.1337, 0.1287, 0.1272],
        [0.1273, 0.1182, 0.1105, 0.1182, 0.1296, 0.1331, 0.1305, 0.1326]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 4 [   0/390]  Loss: 0.6874 (0.687)  Acc@1: 78.1250 (78.1250)  Acc@5: 100.0000 (100.0000)LR: 2.462e-02
Train: 4 [  50/390]  Loss: 0.8206 (0.811)  Acc@1: 68.7500 (71.8444)  Acc@5: 95.3125 (97.8554)LR: 2.462e-02
Train: 4 [ 100/390]  Loss: 0.6331 (0.791)  Acc@1: 76.5625 (72.3236)  Acc@5: 100.0000 (97.8806)LR: 2.462e-02
Train: 4 [ 150/390]  Loss: 0.7360 (0.772)  Acc@1: 71.8750 (73.0857)  Acc@5: 100.0000 (97.9822)LR: 2.462e-02
Train: 4 [ 200/390]  Loss: 0.8915 (0.775)  Acc@1: 73.4375 (73.1110)  Acc@5: 95.3125 (97.9711)LR: 2.462e-02
Train: 4 [ 250/390]  Loss: 0.7634 (0.768)  Acc@1: 73.4375 (73.1138)  Acc@5: 96.8750 (98.1138)LR: 2.462e-02
Train: 4 [ 300/390]  Loss: 0.6700 (0.767)  Acc@1: 73.4375 (73.2299)  Acc@5: 98.4375 (98.0534)LR: 2.462e-02
Train: 4 [ 350/390]  Loss: 0.6324 (0.763)  Acc@1: 76.5625 (73.3752)  Acc@5: 100.0000 (98.0680)LR: 2.462e-02
Train: 4 [ 390/390]  Loss: 0.3461 (0.762)  Acc@1: 92.5000 (73.4240)  Acc@5: 100.0000 (98.0680)LR: 2.462e-02
train_acc 73.424000
Valid: 4 [   0/390]  Loss: 0.6825 (0.683)  Acc@1: 76.5625 (76.5625)  Acc@5: 98.4375 (98.4375)
Valid: 4 [  50/390]  Loss: 1.075 (0.768)  Acc@1: 67.1875 (73.2537)  Acc@5: 98.4375 (98.3456)
Valid: 4 [ 100/390]  Loss: 0.7694 (0.767)  Acc@1: 76.5625 (73.3137)  Acc@5: 96.8750 (98.2673)
Valid: 4 [ 150/390]  Loss: 0.9406 (0.771)  Acc@1: 65.6250 (73.0857)  Acc@5: 98.4375 (98.2616)
Valid: 4 [ 200/390]  Loss: 0.8098 (0.768)  Acc@1: 73.4375 (73.4841)  Acc@5: 96.8750 (98.1810)
Valid: 4 [ 250/390]  Loss: 0.5824 (0.771)  Acc@1: 76.5625 (73.2694)  Acc@5: 98.4375 (98.1760)
Valid: 4 [ 300/390]  Loss: 0.6666 (0.772)  Acc@1: 79.6875 (73.1468)  Acc@5: 100.0000 (98.1468)
Valid: 4 [ 350/390]  Loss: 0.8852 (0.772)  Acc@1: 71.8750 (73.0725)  Acc@5: 92.1875 (98.1170)
Valid: 4 [ 390/390]  Loss: 0.6440 (0.774)  Acc@1: 67.5000 (73.0120)  Acc@5: 100.0000 (98.0520)
valid_acc 73.012000
epoch = 4   
 genotype = Genotype(normal=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 0), ('dil_conv_5x5', 2), ('sep_conv_5x5', 4), ('sep_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 1), ('sep_conv_5x5', 3), ('max_pool_3x3', 0), ('sep_conv_5x5', 3), ('sep_conv_5x5', 1)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1269, 0.1281, 0.1036, 0.1155, 0.1333, 0.1403, 0.1264, 0.1259],
        [0.1318, 0.1211, 0.1022, 0.1128, 0.1389, 0.1343, 0.1313, 0.1275],
        [0.1294, 0.1289, 0.1052, 0.1167, 0.1390, 0.1297, 0.1213, 0.1296],
        [0.1341, 0.1218, 0.1050, 0.1146, 0.1289, 0.1367, 0.1300, 0.1289],
        [0.1330, 0.1177, 0.1031, 0.1170, 0.1307, 0.1341, 0.1320, 0.1325],
        [0.1327, 0.1260, 0.1045, 0.1149, 0.1388, 0.1338, 0.1231, 0.1262],
        [0.1340, 0.1192, 0.1034, 0.1113, 0.1348, 0.1358, 0.1331, 0.1284],
        [0.1397, 0.1128, 0.1013, 0.1140, 0.1326, 0.1341, 0.1292, 0.1363],
        [0.1482, 0.1112, 0.1018, 0.1127, 0.1317, 0.1360, 0.1282, 0.1303],
        [0.1315, 0.1305, 0.1086, 0.1172, 0.1345, 0.1297, 0.1242, 0.1239],
        [0.1416, 0.1209, 0.1059, 0.1128, 0.1316, 0.1338, 0.1254, 0.1281],
        [0.1388, 0.1149, 0.1033, 0.1137, 0.1368, 0.1342, 0.1315, 0.1268],
        [0.1446, 0.1110, 0.1022, 0.1097, 0.1318, 0.1319, 0.1337, 0.1351],
        [0.1457, 0.1100, 0.1022, 0.1069, 0.1286, 0.1375, 0.1339, 0.1353]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1221, 0.1311, 0.1223, 0.1203, 0.1351, 0.1297, 0.1172, 0.1223],
        [0.1282, 0.1289, 0.1187, 0.1221, 0.1291, 0.1262, 0.1225, 0.1243],
        [0.1227, 0.1301, 0.1213, 0.1242, 0.1262, 0.1249, 0.1245, 0.1260],
        [0.1269, 0.1305, 0.1202, 0.1252, 0.1234, 0.1250, 0.1229, 0.1259],
        [0.1221, 0.1214, 0.1075, 0.1200, 0.1319, 0.1305, 0.1327, 0.1339],
        [0.1224, 0.1333, 0.1246, 0.1222, 0.1246, 0.1256, 0.1244, 0.1228],
        [0.1291, 0.1255, 0.1154, 0.1228, 0.1251, 0.1286, 0.1235, 0.1301],
        [0.1284, 0.1225, 0.1072, 0.1235, 0.1323, 0.1327, 0.1220, 0.1315],
        [0.1287, 0.1203, 0.1108, 0.1226, 0.1274, 0.1352, 0.1294, 0.1256],
        [0.1192, 0.1334, 0.1243, 0.1261, 0.1219, 0.1249, 0.1246, 0.1256],
        [0.1268, 0.1263, 0.1163, 0.1206, 0.1243, 0.1349, 0.1212, 0.1297],
        [0.1298, 0.1221, 0.1074, 0.1234, 0.1292, 0.1335, 0.1215, 0.1331],
        [0.1262, 0.1209, 0.1093, 0.1223, 0.1265, 0.1364, 0.1297, 0.1287],
        [0.1276, 0.1178, 0.1081, 0.1170, 0.1304, 0.1346, 0.1314, 0.1332]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 5 [   0/390]  Loss: 0.6537 (0.654)  Acc@1: 75.0000 (75.0000)  Acc@5: 100.0000 (100.0000)LR: 2.441e-02
Train: 5 [  50/390]  Loss: 0.5819 (0.677)  Acc@1: 81.2500 (76.1642)  Acc@5: 98.4375 (99.0196)LR: 2.441e-02
Train: 5 [ 100/390]  Loss: 0.6480 (0.690)  Acc@1: 78.1250 (75.7890)  Acc@5: 100.0000 (98.5922)LR: 2.441e-02
Train: 5 [ 150/390]  Loss: 0.5779 (0.692)  Acc@1: 76.5625 (75.7657)  Acc@5: 100.0000 (98.4685)LR: 2.441e-02
Train: 5 [ 200/390]  Loss: 0.8071 (0.692)  Acc@1: 71.8750 (75.6374)  Acc@5: 98.4375 (98.3753)LR: 2.441e-02
Train: 5 [ 250/390]  Loss: 0.8908 (0.692)  Acc@1: 70.3125 (75.5229)  Acc@5: 98.4375 (98.4375)LR: 2.441e-02
Train: 5 [ 300/390]  Loss: 0.5186 (0.688)  Acc@1: 81.2500 (75.7787)  Acc@5: 100.0000 (98.4998)LR: 2.441e-02
Train: 5 [ 350/390]  Loss: 0.6315 (0.692)  Acc@1: 78.1250 (75.7612)  Acc@5: 96.8750 (98.3885)LR: 2.441e-02
Train: 5 [ 390/390]  Loss: 0.6547 (0.692)  Acc@1: 72.5000 (75.7880)  Acc@5: 100.0000 (98.3920)LR: 2.441e-02
train_acc 75.788000
Valid: 5 [   0/390]  Loss: 0.9312 (0.931)  Acc@1: 70.3125 (70.3125)  Acc@5: 98.4375 (98.4375)
Valid: 5 [  50/390]  Loss: 0.6285 (0.766)  Acc@1: 78.1250 (73.4375)  Acc@5: 100.0000 (98.2230)
Valid: 5 [ 100/390]  Loss: 0.6187 (0.752)  Acc@1: 73.4375 (73.8243)  Acc@5: 100.0000 (98.1745)
Valid: 5 [ 150/390]  Loss: 0.5800 (0.755)  Acc@1: 78.1250 (73.8928)  Acc@5: 100.0000 (98.1892)
Valid: 5 [ 200/390]  Loss: 0.8416 (0.763)  Acc@1: 75.0000 (73.6163)  Acc@5: 96.8750 (98.1732)
Valid: 5 [ 250/390]  Loss: 0.7325 (0.764)  Acc@1: 78.1250 (73.5496)  Acc@5: 93.7500 (98.1636)
Valid: 5 [ 300/390]  Loss: 0.6960 (0.762)  Acc@1: 76.5625 (73.5569)  Acc@5: 96.8750 (98.1987)
Valid: 5 [ 350/390]  Loss: 1.020 (0.765)  Acc@1: 65.6250 (73.4687)  Acc@5: 95.3125 (98.1838)
Valid: 5 [ 390/390]  Loss: 0.4687 (0.760)  Acc@1: 82.5000 (73.5560)  Acc@5: 100.0000 (98.2240)
valid_acc 73.556000
epoch = 5   
 genotype = Genotype(normal=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 2), ('sep_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('max_pool_3x3', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 1), ('sep_conv_5x5', 3), ('sep_conv_5x5', 2), ('sep_conv_5x5', 3), ('sep_conv_5x5', 4)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1273, 0.1265, 0.0998, 0.1135, 0.1349, 0.1432, 0.1283, 0.1265],
        [0.1332, 0.1184, 0.0976, 0.1099, 0.1430, 0.1364, 0.1336, 0.1278],
        [0.1305, 0.1278, 0.1017, 0.1151, 0.1421, 0.1304, 0.1211, 0.1312],
        [0.1351, 0.1200, 0.1013, 0.1128, 0.1307, 0.1385, 0.1311, 0.1304],
        [0.1357, 0.1155, 0.0998, 0.1164, 0.1319, 0.1346, 0.1333, 0.1328],
        [0.1351, 0.1254, 0.1014, 0.1143, 0.1416, 0.1339, 0.1220, 0.1264],
        [0.1366, 0.1176, 0.0998, 0.1097, 0.1357, 0.1374, 0.1341, 0.1291],
        [0.1442, 0.1111, 0.0985, 0.1145, 0.1321, 0.1341, 0.1291, 0.1365],
        [0.1540, 0.1080, 0.0982, 0.1114, 0.1326, 0.1368, 0.1280, 0.1310],
        [0.1334, 0.1292, 0.1050, 0.1153, 0.1363, 0.1312, 0.1248, 0.1246],
        [0.1454, 0.1195, 0.1025, 0.1110, 0.1324, 0.1332, 0.1266, 0.1294],
        [0.1411, 0.1120, 0.0997, 0.1117, 0.1389, 0.1361, 0.1327, 0.1279],
        [0.1495, 0.1074, 0.0982, 0.1070, 0.1319, 0.1348, 0.1345, 0.1367],
        [0.1525, 0.1055, 0.0970, 0.1020, 0.1299, 0.1383, 0.1366, 0.1381]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1226, 0.1315, 0.1218, 0.1193, 0.1353, 0.1323, 0.1146, 0.1228],
        [0.1276, 0.1300, 0.1186, 0.1223, 0.1295, 0.1257, 0.1219, 0.1243],
        [0.1232, 0.1303, 0.1202, 0.1256, 0.1265, 0.1244, 0.1233, 0.1264],
        [0.1256, 0.1318, 0.1203, 0.1262, 0.1229, 0.1248, 0.1224, 0.1260],
        [0.1217, 0.1200, 0.1047, 0.1189, 0.1332, 0.1311, 0.1338, 0.1366],
        [0.1225, 0.1340, 0.1240, 0.1223, 0.1237, 0.1254, 0.1249, 0.1232],
        [0.1281, 0.1263, 0.1154, 0.1224, 0.1264, 0.1285, 0.1235, 0.1295],
        [0.1296, 0.1205, 0.1048, 0.1238, 0.1329, 0.1343, 0.1219, 0.1321],
        [0.1283, 0.1191, 0.1090, 0.1221, 0.1287, 0.1371, 0.1295, 0.1261],
        [0.1204, 0.1333, 0.1240, 0.1259, 0.1215, 0.1251, 0.1243, 0.1253],
        [0.1267, 0.1265, 0.1162, 0.1204, 0.1222, 0.1359, 0.1204, 0.1316],
        [0.1320, 0.1201, 0.1048, 0.1240, 0.1310, 0.1357, 0.1204, 0.1320],
        [0.1261, 0.1191, 0.1069, 0.1214, 0.1274, 0.1379, 0.1309, 0.1303],
        [0.1281, 0.1160, 0.1054, 0.1154, 0.1310, 0.1371, 0.1330, 0.1341]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 6 [   0/390]  Loss: 0.5248 (0.525)  Acc@1: 81.2500 (81.2500)  Acc@5: 98.4375 (98.4375)LR: 2.416e-02
Train: 6 [  50/390]  Loss: 0.6962 (0.642)  Acc@1: 76.5625 (78.0331)  Acc@5: 96.8750 (98.3762)LR: 2.416e-02
Train: 6 [ 100/390]  Loss: 0.4681 (0.636)  Acc@1: 85.9375 (78.2333)  Acc@5: 100.0000 (98.5458)LR: 2.416e-02
Train: 6 [ 150/390]  Loss: 0.5788 (0.635)  Acc@1: 79.6875 (78.3216)  Acc@5: 98.4375 (98.6238)LR: 2.416e-02
Train: 6 [ 200/390]  Loss: 0.5074 (0.627)  Acc@1: 82.8125 (78.2805)  Acc@5: 98.4375 (98.7251)LR: 2.416e-02
Train: 6 [ 250/390]  Loss: 0.7827 (0.628)  Acc@1: 70.3125 (78.3242)  Acc@5: 95.3125 (98.7550)LR: 2.416e-02
Train: 6 [ 300/390]  Loss: 0.5065 (0.632)  Acc@1: 81.2500 (78.1665)  Acc@5: 100.0000 (98.7490)LR: 2.416e-02
Train: 6 [ 350/390]  Loss: 0.6357 (0.636)  Acc@1: 79.6875 (78.0582)  Acc@5: 100.0000 (98.7135)LR: 2.416e-02
Train: 6 [ 390/390]  Loss: 0.7225 (0.637)  Acc@1: 77.5000 (78.0120)  Acc@5: 100.0000 (98.7160)LR: 2.416e-02
train_acc 78.012000
Valid: 6 [   0/390]  Loss: 0.4662 (0.466)  Acc@1: 84.3750 (84.3750)  Acc@5: 98.4375 (98.4375)
Valid: 6 [  50/390]  Loss: 0.8058 (0.620)  Acc@1: 79.6875 (78.2475)  Acc@5: 95.3125 (98.7132)
Valid: 6 [ 100/390]  Loss: 0.7923 (0.633)  Acc@1: 75.0000 (77.9548)  Acc@5: 100.0000 (98.5303)
Valid: 6 [ 150/390]  Loss: 0.6695 (0.632)  Acc@1: 78.1250 (78.0733)  Acc@5: 96.8750 (98.4892)
Valid: 6 [ 200/390]  Loss: 0.8844 (0.634)  Acc@1: 64.0625 (77.9151)  Acc@5: 98.4375 (98.4764)
Valid: 6 [ 250/390]  Loss: 0.5198 (0.629)  Acc@1: 84.3750 (77.9818)  Acc@5: 96.8750 (98.5620)
Valid: 6 [ 300/390]  Loss: 0.7027 (0.633)  Acc@1: 70.3125 (77.9848)  Acc@5: 100.0000 (98.5154)
Valid: 6 [ 350/390]  Loss: 0.4574 (0.633)  Acc@1: 81.2500 (77.8846)  Acc@5: 96.8750 (98.5621)
Valid: 6 [ 390/390]  Loss: 0.5511 (0.635)  Acc@1: 85.0000 (77.8440)  Acc@5: 97.5000 (98.5760)
valid_acc 77.844000
epoch = 6   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 0), ('dil_conv_5x5', 2), ('dil_conv_5x5', 4), ('sep_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 1), ('sep_conv_5x5', 3), ('max_pool_3x3', 0), ('sep_conv_5x5', 3), ('sep_conv_5x5', 4)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1286, 0.1247, 0.0963, 0.1117, 0.1367, 0.1448, 0.1294, 0.1279],
        [0.1332, 0.1174, 0.0946, 0.1081, 0.1461, 0.1377, 0.1344, 0.1286],
        [0.1327, 0.1267, 0.0986, 0.1136, 0.1445, 0.1305, 0.1207, 0.1327],
        [0.1355, 0.1189, 0.0986, 0.1119, 0.1318, 0.1396, 0.1323, 0.1314],
        [0.1380, 0.1132, 0.0965, 0.1153, 0.1341, 0.1356, 0.1338, 0.1335],
        [0.1374, 0.1243, 0.0984, 0.1128, 0.1428, 0.1341, 0.1236, 0.1267],
        [0.1379, 0.1171, 0.0978, 0.1090, 0.1377, 0.1376, 0.1338, 0.1290],
        [0.1470, 0.1093, 0.0958, 0.1138, 0.1316, 0.1343, 0.1294, 0.1389],
        [0.1593, 0.1055, 0.0954, 0.1104, 0.1320, 0.1386, 0.1293, 0.1294],
        [0.1364, 0.1275, 0.1017, 0.1128, 0.1366, 0.1322, 0.1259, 0.1269],
        [0.1496, 0.1180, 0.0998, 0.1097, 0.1335, 0.1333, 0.1263, 0.1297],
        [0.1442, 0.1088, 0.0959, 0.1091, 0.1403, 0.1394, 0.1323, 0.1300],
        [0.1546, 0.1037, 0.0940, 0.1040, 0.1318, 0.1364, 0.1371, 0.1382],
        [0.1582, 0.1013, 0.0925, 0.0977, 0.1309, 0.1401, 0.1387, 0.1406]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1224, 0.1326, 0.1208, 0.1190, 0.1364, 0.1327, 0.1136, 0.1226],
        [0.1276, 0.1297, 0.1172, 0.1222, 0.1300, 0.1253, 0.1233, 0.1247],
        [0.1230, 0.1318, 0.1197, 0.1262, 0.1265, 0.1232, 0.1237, 0.1258],
        [0.1253, 0.1322, 0.1199, 0.1263, 0.1247, 0.1232, 0.1225, 0.1257],
        [0.1218, 0.1199, 0.1028, 0.1197, 0.1344, 0.1305, 0.1337, 0.1372],
        [0.1222, 0.1366, 0.1242, 0.1219, 0.1230, 0.1247, 0.1247, 0.1229],
        [0.1282, 0.1268, 0.1149, 0.1214, 0.1276, 0.1276, 0.1246, 0.1290],
        [0.1299, 0.1206, 0.1029, 0.1244, 0.1335, 0.1353, 0.1220, 0.1314],
        [0.1304, 0.1184, 0.1073, 0.1229, 0.1285, 0.1367, 0.1299, 0.1258],
        [0.1196, 0.1347, 0.1232, 0.1247, 0.1209, 0.1260, 0.1243, 0.1266],
        [0.1260, 0.1261, 0.1150, 0.1203, 0.1218, 0.1372, 0.1213, 0.1323],
        [0.1312, 0.1199, 0.1026, 0.1238, 0.1327, 0.1379, 0.1188, 0.1330],
        [0.1267, 0.1172, 0.1041, 0.1211, 0.1266, 0.1410, 0.1317, 0.1317],
        [0.1290, 0.1138, 0.1020, 0.1132, 0.1319, 0.1388, 0.1356, 0.1357]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 7 [   0/390]  Loss: 0.4487 (0.449)  Acc@1: 81.2500 (81.2500)  Acc@5: 100.0000 (100.0000)LR: 2.386e-02
Train: 7 [  50/390]  Loss: 0.8925 (0.607)  Acc@1: 65.6250 (79.0441)  Acc@5: 98.4375 (99.1422)LR: 2.386e-02
Train: 7 [ 100/390]  Loss: 0.5601 (0.620)  Acc@1: 78.1250 (78.3106)  Acc@5: 100.0000 (99.0099)LR: 2.386e-02
Train: 7 [ 150/390]  Loss: 0.5064 (0.618)  Acc@1: 79.6875 (78.2078)  Acc@5: 100.0000 (99.0273)LR: 2.386e-02
Train: 7 [ 200/390]  Loss: 0.4300 (0.621)  Acc@1: 82.8125 (78.2183)  Acc@5: 100.0000 (98.9817)LR: 2.386e-02
Train: 7 [ 250/390]  Loss: 0.6346 (0.613)  Acc@1: 78.1250 (78.4674)  Acc@5: 100.0000 (98.9480)LR: 2.386e-02
Train: 7 [ 300/390]  Loss: 0.5412 (0.609)  Acc@1: 84.3750 (78.5247)  Acc@5: 98.4375 (98.9151)LR: 2.386e-02
Train: 7 [ 350/390]  Loss: 0.5895 (0.604)  Acc@1: 78.1250 (78.7349)  Acc@5: 100.0000 (98.9094)LR: 2.386e-02
Train: 7 [ 390/390]  Loss: 0.6663 (0.603)  Acc@1: 75.0000 (78.8160)  Acc@5: 97.5000 (98.9320)LR: 2.386e-02
train_acc 78.816000
Valid: 7 [   0/390]  Loss: 0.5238 (0.524)  Acc@1: 79.6875 (79.6875)  Acc@5: 100.0000 (100.0000)
Valid: 7 [  50/390]  Loss: 0.5958 (0.619)  Acc@1: 81.2500 (79.0748)  Acc@5: 98.4375 (98.5907)
Valid: 7 [ 100/390]  Loss: 0.4409 (0.620)  Acc@1: 85.9375 (78.8985)  Acc@5: 100.0000 (98.7160)
Valid: 7 [ 150/390]  Loss: 0.6083 (0.632)  Acc@1: 76.5625 (78.5596)  Acc@5: 98.4375 (98.6134)
Valid: 7 [ 200/390]  Loss: 0.5562 (0.631)  Acc@1: 81.2500 (78.4282)  Acc@5: 100.0000 (98.6163)
Valid: 7 [ 250/390]  Loss: 0.8476 (0.633)  Acc@1: 75.0000 (78.3242)  Acc@5: 93.7500 (98.6367)
Valid: 7 [ 300/390]  Loss: 0.5653 (0.633)  Acc@1: 84.3750 (78.4001)  Acc@5: 96.8750 (98.6296)
Valid: 7 [ 350/390]  Loss: 0.3921 (0.632)  Acc@1: 85.9375 (78.4277)  Acc@5: 98.4375 (98.5978)
Valid: 7 [ 390/390]  Loss: 0.5444 (0.635)  Acc@1: 75.0000 (78.2800)  Acc@5: 100.0000 (98.6040)
valid_acc 78.280000
epoch = 7   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 0), ('sep_conv_5x5', 3), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_5x5', 3), ('sep_conv_5x5', 3), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1295, 0.1234, 0.0935, 0.1105, 0.1389, 0.1458, 0.1302, 0.1282],
        [0.1339, 0.1147, 0.0914, 0.1059, 0.1497, 0.1401, 0.1357, 0.1285],
        [0.1346, 0.1247, 0.0960, 0.1124, 0.1462, 0.1314, 0.1210, 0.1338],
        [0.1373, 0.1166, 0.0964, 0.1111, 0.1322, 0.1415, 0.1326, 0.1324],
        [0.1407, 0.1100, 0.0932, 0.1140, 0.1356, 0.1372, 0.1356, 0.1338],
        [0.1402, 0.1223, 0.0956, 0.1118, 0.1442, 0.1346, 0.1240, 0.1272],
        [0.1411, 0.1147, 0.0954, 0.1082, 0.1403, 0.1379, 0.1342, 0.1282],
        [0.1513, 0.1061, 0.0925, 0.1128, 0.1326, 0.1356, 0.1296, 0.1395],
        [0.1643, 0.1015, 0.0914, 0.1076, 0.1326, 0.1411, 0.1329, 0.1288],
        [0.1392, 0.1252, 0.0986, 0.1109, 0.1383, 0.1321, 0.1272, 0.1285],
        [0.1543, 0.1146, 0.0968, 0.1079, 0.1344, 0.1350, 0.1263, 0.1307],
        [0.1483, 0.1046, 0.0919, 0.1067, 0.1425, 0.1412, 0.1327, 0.1320],
        [0.1615, 0.0997, 0.0897, 0.1005, 0.1326, 0.1383, 0.1389, 0.1389],
        [0.1665, 0.0969, 0.0883, 0.0940, 0.1314, 0.1416, 0.1392, 0.1421]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1213, 0.1333, 0.1205, 0.1185, 0.1374, 0.1341, 0.1128, 0.1221],
        [0.1283, 0.1288, 0.1158, 0.1223, 0.1310, 0.1249, 0.1248, 0.1241],
        [0.1222, 0.1322, 0.1188, 0.1273, 0.1266, 0.1221, 0.1243, 0.1265],
        [0.1263, 0.1312, 0.1185, 0.1256, 0.1264, 0.1236, 0.1226, 0.1258],
        [0.1203, 0.1190, 0.1011, 0.1190, 0.1355, 0.1326, 0.1355, 0.1371],
        [0.1215, 0.1378, 0.1243, 0.1209, 0.1226, 0.1254, 0.1254, 0.1221],
        [0.1286, 0.1263, 0.1139, 0.1201, 0.1282, 0.1282, 0.1259, 0.1287],
        [0.1294, 0.1198, 0.1013, 0.1239, 0.1352, 0.1367, 0.1225, 0.1312],
        [0.1311, 0.1176, 0.1060, 0.1232, 0.1283, 0.1372, 0.1299, 0.1268],
        [0.1198, 0.1341, 0.1219, 0.1251, 0.1215, 0.1257, 0.1243, 0.1276],
        [0.1271, 0.1242, 0.1128, 0.1205, 0.1206, 0.1381, 0.1211, 0.1355],
        [0.1314, 0.1184, 0.1001, 0.1230, 0.1337, 0.1423, 0.1183, 0.1329],
        [0.1279, 0.1153, 0.1018, 0.1206, 0.1261, 0.1424, 0.1325, 0.1334],
        [0.1299, 0.1120, 0.0999, 0.1119, 0.1320, 0.1392, 0.1381, 0.1371]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 8 [   0/390]  Loss: 0.6894 (0.689)  Acc@1: 78.1250 (78.1250)  Acc@5: 98.4375 (98.4375)LR: 2.352e-02
Train: 8 [  50/390]  Loss: 0.5285 (0.564)  Acc@1: 81.2500 (81.4338)  Acc@5: 100.0000 (99.2341)LR: 2.352e-02
Train: 8 [ 100/390]  Loss: 0.3650 (0.562)  Acc@1: 85.9375 (80.6002)  Acc@5: 100.0000 (99.1027)LR: 2.352e-02
Train: 8 [ 150/390]  Loss: 0.4150 (0.555)  Acc@1: 89.0625 (80.7637)  Acc@5: 100.0000 (99.0687)LR: 2.352e-02
Train: 8 [ 200/390]  Loss: 0.6321 (0.568)  Acc@1: 75.0000 (80.1695)  Acc@5: 100.0000 (99.0516)LR: 2.352e-02
Train: 8 [ 250/390]  Loss: 0.5823 (0.566)  Acc@1: 75.0000 (80.1855)  Acc@5: 100.0000 (99.0662)LR: 2.352e-02
Train: 8 [ 300/390]  Loss: 0.4606 (0.566)  Acc@1: 79.6875 (80.2533)  Acc@5: 100.0000 (99.0449)LR: 2.352e-02
Train: 8 [ 350/390]  Loss: 0.8928 (0.570)  Acc@1: 71.8750 (80.0748)  Acc@5: 96.8750 (99.0652)LR: 2.352e-02
Train: 8 [ 390/390]  Loss: 0.6065 (0.569)  Acc@1: 72.5000 (80.1240)  Acc@5: 100.0000 (99.0560)LR: 2.352e-02
train_acc 80.124000
Valid: 8 [   0/390]  Loss: 0.6700 (0.670)  Acc@1: 78.1250 (78.1250)  Acc@5: 96.8750 (96.8750)
Valid: 8 [  50/390]  Loss: 0.5133 (0.601)  Acc@1: 81.2500 (79.9020)  Acc@5: 100.0000 (98.8051)
Valid: 8 [ 100/390]  Loss: 0.6003 (0.610)  Acc@1: 81.2500 (79.3472)  Acc@5: 100.0000 (98.7778)
Valid: 8 [ 150/390]  Loss: 0.7020 (0.604)  Acc@1: 76.5625 (79.2943)  Acc@5: 100.0000 (98.8307)
Valid: 8 [ 200/390]  Loss: 0.3901 (0.609)  Acc@1: 84.3750 (79.0967)  Acc@5: 100.0000 (98.8262)
Valid: 8 [ 250/390]  Loss: 0.7042 (0.616)  Acc@1: 78.1250 (78.9778)  Acc@5: 98.4375 (98.7612)
Valid: 8 [ 300/390]  Loss: 0.7425 (0.616)  Acc@1: 70.3125 (78.8414)  Acc@5: 96.8750 (98.7542)
Valid: 8 [ 350/390]  Loss: 0.4988 (0.611)  Acc@1: 81.2500 (78.8818)  Acc@5: 96.8750 (98.7981)
Valid: 8 [ 390/390]  Loss: 0.6812 (0.612)  Acc@1: 77.5000 (78.8640)  Acc@5: 97.5000 (98.7680)
valid_acc 78.864000
epoch = 8   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('dil_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_5x5', 2), ('sep_conv_5x5', 2), ('sep_conv_5x5', 3)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1306, 0.1214, 0.0909, 0.1091, 0.1425, 0.1472, 0.1302, 0.1282],
        [0.1349, 0.1123, 0.0891, 0.1050, 0.1540, 0.1422, 0.1350, 0.1275],
        [0.1367, 0.1232, 0.0938, 0.1114, 0.1481, 0.1317, 0.1202, 0.1350],
        [0.1380, 0.1146, 0.0944, 0.1106, 0.1324, 0.1434, 0.1331, 0.1336],
        [0.1451, 0.1078, 0.0903, 0.1132, 0.1356, 0.1374, 0.1367, 0.1339],
        [0.1433, 0.1198, 0.0927, 0.1101, 0.1449, 0.1350, 0.1264, 0.1279],
        [0.1424, 0.1121, 0.0929, 0.1066, 0.1443, 0.1383, 0.1338, 0.1296],
        [0.1559, 0.1032, 0.0894, 0.1114, 0.1332, 0.1357, 0.1296, 0.1416],
        [0.1720, 0.0973, 0.0875, 0.1046, 0.1325, 0.1423, 0.1336, 0.1302],
        [0.1433, 0.1232, 0.0963, 0.1099, 0.1387, 0.1311, 0.1269, 0.1307],
        [0.1593, 0.1125, 0.0945, 0.1067, 0.1355, 0.1359, 0.1259, 0.1297],
        [0.1513, 0.1018, 0.0887, 0.1039, 0.1436, 0.1434, 0.1347, 0.1326],
        [0.1678, 0.0958, 0.0857, 0.0973, 0.1336, 0.1400, 0.1410, 0.1387],
        [0.1748, 0.0931, 0.0846, 0.0904, 0.1326, 0.1418, 0.1407, 0.1420]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1211, 0.1340, 0.1199, 0.1185, 0.1380, 0.1347, 0.1127, 0.1210],
        [0.1278, 0.1281, 0.1141, 0.1232, 0.1338, 0.1243, 0.1249, 0.1238],
        [0.1211, 0.1335, 0.1185, 0.1278, 0.1278, 0.1212, 0.1250, 0.1251],
        [0.1274, 0.1308, 0.1171, 0.1274, 0.1267, 0.1235, 0.1225, 0.1246],
        [0.1199, 0.1178, 0.0995, 0.1186, 0.1353, 0.1352, 0.1371, 0.1365],
        [0.1211, 0.1394, 0.1242, 0.1207, 0.1221, 0.1250, 0.1255, 0.1219],
        [0.1295, 0.1268, 0.1137, 0.1196, 0.1283, 0.1275, 0.1263, 0.1283],
        [0.1295, 0.1187, 0.1003, 0.1243, 0.1352, 0.1374, 0.1229, 0.1316],
        [0.1324, 0.1162, 0.1046, 0.1239, 0.1296, 0.1366, 0.1307, 0.1260],
        [0.1192, 0.1342, 0.1212, 0.1256, 0.1214, 0.1268, 0.1239, 0.1278],
        [0.1273, 0.1245, 0.1120, 0.1213, 0.1203, 0.1379, 0.1209, 0.1360],
        [0.1316, 0.1175, 0.0992, 0.1238, 0.1349, 0.1447, 0.1171, 0.1311],
        [0.1283, 0.1144, 0.1005, 0.1209, 0.1247, 0.1443, 0.1332, 0.1337],
        [0.1299, 0.1102, 0.0981, 0.1108, 0.1326, 0.1382, 0.1404, 0.1397]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 9 [   0/390]  Loss: 0.4902 (0.490)  Acc@1: 82.8125 (82.8125)  Acc@5: 98.4375 (98.4375)LR: 2.313e-02
Train: 9 [  50/390]  Loss: 0.7358 (0.555)  Acc@1: 71.8750 (80.3002)  Acc@5: 98.4375 (99.0196)LR: 2.313e-02
Train: 9 [ 100/390]  Loss: 0.3903 (0.558)  Acc@1: 89.0625 (80.6776)  Acc@5: 96.8750 (99.0718)LR: 2.313e-02
Train: 9 [ 150/390]  Loss: 0.5468 (0.544)  Acc@1: 82.8125 (80.9810)  Acc@5: 98.4375 (99.1618)LR: 2.313e-02
Train: 9 [ 200/390]  Loss: 0.3579 (0.542)  Acc@1: 87.5000 (81.0479)  Acc@5: 100.0000 (99.1682)LR: 2.313e-02
Train: 9 [ 250/390]  Loss: 0.4857 (0.545)  Acc@1: 82.8125 (80.9014)  Acc@5: 98.4375 (99.1347)LR: 2.313e-02
Train: 9 [ 300/390]  Loss: 0.5154 (0.541)  Acc@1: 79.6875 (80.9697)  Acc@5: 98.4375 (99.1746)LR: 2.313e-02
Train: 9 [ 350/390]  Loss: 0.3725 (0.536)  Acc@1: 87.5000 (81.1432)  Acc@5: 98.4375 (99.1720)LR: 2.313e-02
Train: 9 [ 390/390]  Loss: 1.183 (0.535)  Acc@1: 62.5000 (81.2520)  Acc@5: 92.5000 (99.1480)LR: 2.313e-02
train_acc 81.252000
Valid: 9 [   0/390]  Loss: 0.4776 (0.478)  Acc@1: 78.1250 (78.1250)  Acc@5: 98.4375 (98.4375)
Valid: 9 [  50/390]  Loss: 0.6243 (0.604)  Acc@1: 76.5625 (79.5037)  Acc@5: 100.0000 (98.7132)
Valid: 9 [ 100/390]  Loss: 0.5504 (0.607)  Acc@1: 78.1250 (79.0842)  Acc@5: 100.0000 (98.8707)
Valid: 9 [ 150/390]  Loss: 0.6171 (0.591)  Acc@1: 79.6875 (79.8117)  Acc@5: 100.0000 (98.8307)
Valid: 9 [ 200/390]  Loss: 0.4909 (0.586)  Acc@1: 84.3750 (80.2006)  Acc@5: 98.4375 (98.8106)
Valid: 9 [ 250/390]  Loss: 0.4609 (0.592)  Acc@1: 87.5000 (79.8867)  Acc@5: 98.4375 (98.7674)
Valid: 9 [ 300/390]  Loss: 0.6287 (0.590)  Acc@1: 75.0000 (79.9886)  Acc@5: 98.4375 (98.7957)
Valid: 9 [ 350/390]  Loss: 0.5621 (0.586)  Acc@1: 79.6875 (80.0926)  Acc@5: 100.0000 (98.7981)
Valid: 9 [ 390/390]  Loss: 0.5161 (0.588)  Acc@1: 82.5000 (79.9560)  Acc@5: 100.0000 (98.7760)
valid_acc 79.956000
epoch = 9   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_5x5', 2), ('dil_conv_3x3', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('dil_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_5x5', 3), ('sep_conv_5x5', 2), ('sep_conv_5x5', 3)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1305, 0.1186, 0.0880, 0.1081, 0.1476, 0.1486, 0.1308, 0.1277],
        [0.1363, 0.1090, 0.0857, 0.1028, 0.1577, 0.1449, 0.1354, 0.1282],
        [0.1391, 0.1217, 0.0915, 0.1110, 0.1493, 0.1314, 0.1212, 0.1348],
        [0.1403, 0.1119, 0.0916, 0.1094, 0.1329, 0.1444, 0.1345, 0.1349],
        [0.1458, 0.1054, 0.0873, 0.1116, 0.1384, 0.1395, 0.1374, 0.1346],
        [0.1467, 0.1182, 0.0905, 0.1100, 0.1453, 0.1355, 0.1265, 0.1274],
        [0.1456, 0.1099, 0.0905, 0.1060, 0.1457, 0.1391, 0.1323, 0.1310],
        [0.1604, 0.1006, 0.0867, 0.1113, 0.1341, 0.1360, 0.1287, 0.1423],
        [0.1789, 0.0940, 0.0845, 0.1035, 0.1317, 0.1437, 0.1342, 0.1295],
        [0.1469, 0.1201, 0.0929, 0.1084, 0.1407, 0.1309, 0.1279, 0.1321],
        [0.1660, 0.1094, 0.0914, 0.1053, 0.1363, 0.1368, 0.1252, 0.1296],
        [0.1553, 0.0984, 0.0850, 0.1020, 0.1439, 0.1451, 0.1365, 0.1338],
        [0.1747, 0.0921, 0.0820, 0.0948, 0.1334, 0.1405, 0.1429, 0.1396],
        [0.1837, 0.0884, 0.0800, 0.0863, 0.1339, 0.1418, 0.1430, 0.1430]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1203, 0.1348, 0.1193, 0.1180, 0.1391, 0.1352, 0.1124, 0.1209],
        [0.1277, 0.1305, 0.1149, 0.1230, 0.1338, 0.1232, 0.1258, 0.1210],
        [0.1211, 0.1344, 0.1179, 0.1286, 0.1271, 0.1193, 0.1264, 0.1252],
        [0.1261, 0.1323, 0.1176, 0.1280, 0.1267, 0.1229, 0.1234, 0.1230],
        [0.1207, 0.1163, 0.0970, 0.1188, 0.1366, 0.1365, 0.1372, 0.1369],
        [0.1212, 0.1408, 0.1245, 0.1207, 0.1220, 0.1242, 0.1259, 0.1208],
        [0.1289, 0.1292, 0.1148, 0.1185, 0.1287, 0.1263, 0.1259, 0.1277],
        [0.1306, 0.1180, 0.0987, 0.1262, 0.1362, 0.1369, 0.1224, 0.1309],
        [0.1342, 0.1154, 0.1028, 0.1252, 0.1285, 0.1370, 0.1309, 0.1262],
        [0.1189, 0.1350, 0.1206, 0.1252, 0.1212, 0.1264, 0.1226, 0.1300],
        [0.1257, 0.1267, 0.1130, 0.1219, 0.1189, 0.1371, 0.1207, 0.1359],
        [0.1322, 0.1165, 0.0972, 0.1252, 0.1357, 0.1464, 0.1159, 0.1309],
        [0.1290, 0.1128, 0.0986, 0.1215, 0.1250, 0.1451, 0.1333, 0.1347],
        [0.1309, 0.1083, 0.0955, 0.1098, 0.1326, 0.1396, 0.1424, 0.1408]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 10 [   0/390]  Loss: 0.7609 (0.761)  Acc@1: 75.0000 (75.0000)  Acc@5: 96.8750 (96.8750)LR: 2.271e-02
Train: 10 [  50/390]  Loss: 0.4387 (0.534)  Acc@1: 84.3750 (81.4338)  Acc@5: 100.0000 (99.0196)LR: 2.271e-02
Train: 10 [ 100/390]  Loss: 0.6012 (0.522)  Acc@1: 84.3750 (82.3639)  Acc@5: 96.8750 (99.1027)LR: 2.271e-02
Train: 10 [ 150/390]  Loss: 0.4802 (0.515)  Acc@1: 85.9375 (82.4503)  Acc@5: 98.4375 (99.1204)LR: 2.271e-02
Train: 10 [ 200/390]  Loss: 0.3624 (0.513)  Acc@1: 92.1875 (82.3383)  Acc@5: 98.4375 (99.1682)LR: 2.271e-02
Train: 10 [ 250/390]  Loss: 0.7736 (0.518)  Acc@1: 73.4375 (82.0655)  Acc@5: 98.4375 (99.1534)LR: 2.271e-02
Train: 10 [ 300/390]  Loss: 0.5781 (0.513)  Acc@1: 76.5625 (82.2051)  Acc@5: 100.0000 (99.1539)LR: 2.271e-02
Train: 10 [ 350/390]  Loss: 0.3786 (0.514)  Acc@1: 82.8125 (82.1581)  Acc@5: 100.0000 (99.1765)LR: 2.271e-02
Train: 10 [ 390/390]  Loss: 0.4518 (0.515)  Acc@1: 87.5000 (82.0800)  Acc@5: 100.0000 (99.1840)LR: 2.271e-02
train_acc 82.080000
Valid: 10 [   0/390]  Loss: 0.6344 (0.634)  Acc@1: 79.6875 (79.6875)  Acc@5: 100.0000 (100.0000)
Valid: 10 [  50/390]  Loss: 0.4828 (0.569)  Acc@1: 84.3750 (81.0355)  Acc@5: 96.8750 (99.0196)
Valid: 10 [ 100/390]  Loss: 0.8302 (0.591)  Acc@1: 76.5625 (80.3837)  Acc@5: 98.4375 (98.9480)
Valid: 10 [ 150/390]  Loss: 0.3566 (0.586)  Acc@1: 89.0625 (80.5671)  Acc@5: 98.4375 (98.9238)
Valid: 10 [ 200/390]  Loss: 0.4314 (0.575)  Acc@1: 82.8125 (80.7758)  Acc@5: 98.4375 (98.9350)
Valid: 10 [ 250/390]  Loss: 0.4828 (0.574)  Acc@1: 82.8125 (80.7209)  Acc@5: 100.0000 (98.9604)
Valid: 10 [ 300/390]  Loss: 0.7001 (0.572)  Acc@1: 79.6875 (80.6271)  Acc@5: 96.8750 (98.9151)
Valid: 10 [ 350/390]  Loss: 0.5444 (0.570)  Acc@1: 89.0625 (80.7915)  Acc@5: 98.4375 (98.9005)
Valid: 10 [ 390/390]  Loss: 0.5592 (0.571)  Acc@1: 77.5000 (80.7240)  Acc@5: 100.0000 (98.9040)
valid_acc 80.724000
epoch = 10   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('sep_conv_3x3', 2), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 2), ('dil_conv_3x3', 4)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1318, 0.1169, 0.0856, 0.1080, 0.1496, 0.1485, 0.1322, 0.1274],
        [0.1371, 0.1061, 0.0824, 0.1007, 0.1612, 0.1483, 0.1350, 0.1293],
        [0.1404, 0.1208, 0.0895, 0.1114, 0.1514, 0.1303, 0.1208, 0.1354],
        [0.1421, 0.1095, 0.0885, 0.1080, 0.1343, 0.1461, 0.1356, 0.1359],
        [0.1494, 0.1035, 0.0844, 0.1108, 0.1385, 0.1402, 0.1384, 0.1350],
        [0.1500, 0.1171, 0.0885, 0.1106, 0.1452, 0.1338, 0.1273, 0.1276],
        [0.1492, 0.1070, 0.0872, 0.1040, 0.1472, 0.1406, 0.1322, 0.1326],
        [0.1653, 0.0975, 0.0832, 0.1099, 0.1351, 0.1368, 0.1286, 0.1435],
        [0.1862, 0.0904, 0.0807, 0.1011, 0.1327, 0.1453, 0.1348, 0.1287],
        [0.1508, 0.1179, 0.0901, 0.1080, 0.1398, 0.1316, 0.1293, 0.1325],
        [0.1719, 0.1062, 0.0878, 0.1030, 0.1386, 0.1380, 0.1245, 0.1300],
        [0.1599, 0.0948, 0.0809, 0.0995, 0.1457, 0.1447, 0.1394, 0.1351],
        [0.1823, 0.0881, 0.0779, 0.0917, 0.1333, 0.1418, 0.1455, 0.1394],
        [0.1914, 0.0842, 0.0757, 0.0823, 0.1354, 0.1426, 0.1440, 0.1443]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1195, 0.1367, 0.1196, 0.1171, 0.1396, 0.1360, 0.1117, 0.1197],
        [0.1279, 0.1299, 0.1127, 0.1235, 0.1357, 0.1239, 0.1258, 0.1207],
        [0.1201, 0.1354, 0.1176, 0.1278, 0.1287, 0.1188, 0.1262, 0.1254],
        [0.1263, 0.1318, 0.1161, 0.1285, 0.1267, 0.1236, 0.1234, 0.1236],
        [0.1221, 0.1147, 0.0948, 0.1189, 0.1383, 0.1369, 0.1373, 0.1371],
        [0.1205, 0.1434, 0.1254, 0.1208, 0.1202, 0.1234, 0.1257, 0.1207],
        [0.1288, 0.1294, 0.1138, 0.1196, 0.1284, 0.1269, 0.1258, 0.1273],
        [0.1308, 0.1180, 0.0975, 0.1278, 0.1373, 0.1357, 0.1225, 0.1304],
        [0.1342, 0.1144, 0.1016, 0.1265, 0.1289, 0.1362, 0.1320, 0.1262],
        [0.1176, 0.1366, 0.1209, 0.1247, 0.1208, 0.1275, 0.1217, 0.1300],
        [0.1261, 0.1261, 0.1114, 0.1226, 0.1189, 0.1369, 0.1213, 0.1368],
        [0.1323, 0.1155, 0.0956, 0.1262, 0.1366, 0.1466, 0.1160, 0.1310],
        [0.1306, 0.1113, 0.0971, 0.1228, 0.1262, 0.1453, 0.1326, 0.1340],
        [0.1310, 0.1061, 0.0934, 0.1092, 0.1331, 0.1394, 0.1453, 0.1426]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 11 [   0/390]  Loss: 0.4707 (0.471)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)LR: 2.225e-02
Train: 11 [  50/390]  Loss: 0.3159 (0.466)  Acc@1: 89.0625 (84.2831)  Acc@5: 100.0000 (99.2034)LR: 2.225e-02
Train: 11 [ 100/390]  Loss: 0.3943 (0.478)  Acc@1: 85.9375 (83.8645)  Acc@5: 100.0000 (99.2110)LR: 2.225e-02
Train: 11 [ 150/390]  Loss: 0.4749 (0.489)  Acc@1: 81.2500 (83.3713)  Acc@5: 100.0000 (99.2446)LR: 2.225e-02
Train: 11 [ 200/390]  Loss: 0.6479 (0.493)  Acc@1: 79.6875 (83.0846)  Acc@5: 100.0000 (99.2382)LR: 2.225e-02
Train: 11 [ 250/390]  Loss: 0.4018 (0.496)  Acc@1: 79.6875 (83.0304)  Acc@5: 100.0000 (99.2281)LR: 2.225e-02
Train: 11 [ 300/390]  Loss: 0.6548 (0.502)  Acc@1: 78.1250 (82.8281)  Acc@5: 100.0000 (99.2058)LR: 2.225e-02
Train: 11 [ 350/390]  Loss: 0.3425 (0.501)  Acc@1: 89.0625 (82.8837)  Acc@5: 98.4375 (99.2343)LR: 2.225e-02
Train: 11 [ 390/390]  Loss: 0.6726 (0.500)  Acc@1: 75.0000 (82.8920)  Acc@5: 97.5000 (99.2320)LR: 2.225e-02
train_acc 82.892000
Valid: 11 [   0/390]  Loss: 0.3569 (0.357)  Acc@1: 84.3750 (84.3750)  Acc@5: 100.0000 (100.0000)
Valid: 11 [  50/390]  Loss: 0.4214 (0.527)  Acc@1: 85.9375 (82.6593)  Acc@5: 100.0000 (99.0196)
Valid: 11 [ 100/390]  Loss: 0.5400 (0.534)  Acc@1: 79.6875 (81.8379)  Acc@5: 100.0000 (98.9635)
Valid: 11 [ 150/390]  Loss: 0.6808 (0.534)  Acc@1: 76.5625 (81.9433)  Acc@5: 98.4375 (98.9652)
Valid: 11 [ 200/390]  Loss: 0.5090 (0.531)  Acc@1: 78.1250 (81.9729)  Acc@5: 100.0000 (98.9972)
Valid: 11 [ 250/390]  Loss: 0.6125 (0.528)  Acc@1: 81.2500 (82.0344)  Acc@5: 96.8750 (99.0289)
Valid: 11 [ 300/390]  Loss: 0.5202 (0.526)  Acc@1: 81.2500 (82.0131)  Acc@5: 100.0000 (99.0345)
Valid: 11 [ 350/390]  Loss: 0.5123 (0.526)  Acc@1: 81.2500 (82.1136)  Acc@5: 100.0000 (99.0296)
Valid: 11 [ 390/390]  Loss: 0.6162 (0.521)  Acc@1: 82.5000 (82.3120)  Acc@5: 97.5000 (99.0600)
valid_acc 82.312000
epoch = 11   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 2), ('dil_conv_3x3', 4)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1338, 0.1137, 0.0834, 0.1073, 0.1516, 0.1479, 0.1339, 0.1284],
        [0.1365, 0.1038, 0.0799, 0.0992, 0.1665, 0.1498, 0.1346, 0.1297],
        [0.1426, 0.1175, 0.0874, 0.1109, 0.1525, 0.1322, 0.1202, 0.1368],
        [0.1428, 0.1075, 0.0865, 0.1073, 0.1354, 0.1489, 0.1362, 0.1355],
        [0.1532, 0.1004, 0.0820, 0.1104, 0.1395, 0.1407, 0.1379, 0.1359],
        [0.1538, 0.1142, 0.0863, 0.1103, 0.1444, 0.1351, 0.1282, 0.1277],
        [0.1512, 0.1046, 0.0849, 0.1026, 0.1489, 0.1432, 0.1317, 0.1330],
        [0.1710, 0.0944, 0.0808, 0.1096, 0.1357, 0.1368, 0.1283, 0.1435],
        [0.1928, 0.0868, 0.0776, 0.0988, 0.1347, 0.1452, 0.1359, 0.1282],
        [0.1555, 0.1145, 0.0880, 0.1076, 0.1415, 0.1294, 0.1306, 0.1328],
        [0.1771, 0.1041, 0.0859, 0.1018, 0.1394, 0.1386, 0.1241, 0.1289],
        [0.1647, 0.0912, 0.0780, 0.0980, 0.1457, 0.1454, 0.1406, 0.1366],
        [0.1893, 0.0843, 0.0749, 0.0893, 0.1325, 0.1425, 0.1464, 0.1409],
        [0.1995, 0.0805, 0.0725, 0.0791, 0.1350, 0.1427, 0.1440, 0.1467]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1186, 0.1366, 0.1197, 0.1165, 0.1396, 0.1373, 0.1120, 0.1196],
        [0.1279, 0.1288, 0.1114, 0.1239, 0.1369, 0.1220, 0.1280, 0.1210],
        [0.1191, 0.1346, 0.1169, 0.1297, 0.1291, 0.1188, 0.1267, 0.1251],
        [0.1262, 0.1309, 0.1151, 0.1288, 0.1274, 0.1246, 0.1237, 0.1233],
        [0.1213, 0.1129, 0.0935, 0.1185, 0.1397, 0.1384, 0.1374, 0.1382],
        [0.1202, 0.1435, 0.1261, 0.1198, 0.1202, 0.1246, 0.1257, 0.1199],
        [0.1292, 0.1288, 0.1138, 0.1194, 0.1281, 0.1267, 0.1263, 0.1277],
        [0.1318, 0.1159, 0.0970, 0.1288, 0.1368, 0.1366, 0.1226, 0.1304],
        [0.1354, 0.1126, 0.1010, 0.1277, 0.1294, 0.1364, 0.1316, 0.1259],
        [0.1175, 0.1365, 0.1212, 0.1251, 0.1209, 0.1284, 0.1215, 0.1290],
        [0.1257, 0.1247, 0.1103, 0.1231, 0.1189, 0.1376, 0.1224, 0.1373],
        [0.1325, 0.1136, 0.0947, 0.1269, 0.1371, 0.1485, 0.1149, 0.1317],
        [0.1311, 0.1092, 0.0963, 0.1242, 0.1262, 0.1454, 0.1323, 0.1353],
        [0.1322, 0.1037, 0.0919, 0.1085, 0.1325, 0.1399, 0.1468, 0.1446]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 12 [   0/390]  Loss: 0.3784 (0.378)  Acc@1: 87.5000 (87.5000)  Acc@5: 98.4375 (98.4375)LR: 2.175e-02
Train: 12 [  50/390]  Loss: 0.3472 (0.470)  Acc@1: 87.5000 (83.7316)  Acc@5: 98.4375 (99.2034)LR: 2.175e-02
Train: 12 [ 100/390]  Loss: 0.4987 (0.471)  Acc@1: 87.5000 (83.8181)  Acc@5: 98.4375 (99.1801)LR: 2.175e-02
Train: 12 [ 150/390]  Loss: 0.5553 (0.471)  Acc@1: 79.6875 (83.5679)  Acc@5: 96.8750 (99.1825)LR: 2.175e-02
Train: 12 [ 200/390]  Loss: 0.5861 (0.472)  Acc@1: 78.1250 (83.6132)  Acc@5: 100.0000 (99.1838)LR: 2.175e-02
Train: 12 [ 250/390]  Loss: 0.4198 (0.473)  Acc@1: 90.6250 (83.6404)  Acc@5: 96.8750 (99.1783)LR: 2.175e-02
Train: 12 [ 300/390]  Loss: 0.3224 (0.472)  Acc@1: 90.6250 (83.7002)  Acc@5: 100.0000 (99.2110)LR: 2.175e-02
Train: 12 [ 350/390]  Loss: 0.5500 (0.473)  Acc@1: 81.2500 (83.6182)  Acc@5: 100.0000 (99.2121)LR: 2.175e-02
Train: 12 [ 390/390]  Loss: 0.4794 (0.471)  Acc@1: 82.5000 (83.6920)  Acc@5: 100.0000 (99.2320)LR: 2.175e-02
train_acc 83.692000
Valid: 12 [   0/390]  Loss: 0.4929 (0.493)  Acc@1: 81.2500 (81.2500)  Acc@5: 98.4375 (98.4375)
Valid: 12 [  50/390]  Loss: 0.6871 (0.546)  Acc@1: 81.2500 (80.6679)  Acc@5: 98.4375 (99.0196)
Valid: 12 [ 100/390]  Loss: 0.5141 (0.548)  Acc@1: 84.3750 (81.1572)  Acc@5: 98.4375 (98.9171)
Valid: 12 [ 150/390]  Loss: 0.6230 (0.547)  Acc@1: 76.5625 (81.0741)  Acc@5: 93.7500 (98.9342)
Valid: 12 [ 200/390]  Loss: 0.5938 (0.550)  Acc@1: 81.2500 (81.1489)  Acc@5: 96.8750 (98.9039)
Valid: 12 [ 250/390]  Loss: 0.9959 (0.547)  Acc@1: 65.6250 (81.3372)  Acc@5: 98.4375 (98.9168)
Valid: 12 [ 300/390]  Loss: 0.4365 (0.545)  Acc@1: 85.9375 (81.4836)  Acc@5: 100.0000 (98.9410)
Valid: 12 [ 350/390]  Loss: 0.5611 (0.546)  Acc@1: 78.1250 (81.4370)  Acc@5: 100.0000 (98.9450)
Valid: 12 [ 390/390]  Loss: 0.9036 (0.546)  Acc@1: 75.0000 (81.4560)  Acc@5: 97.5000 (98.9120)
valid_acc 81.456000
epoch = 12   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 2), ('dil_conv_3x3', 4)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1369, 0.1106, 0.0816, 0.1069, 0.1538, 0.1467, 0.1340, 0.1296],
        [0.1359, 0.1004, 0.0777, 0.0977, 0.1690, 0.1527, 0.1357, 0.1308],
        [0.1449, 0.1141, 0.0850, 0.1096, 0.1555, 0.1329, 0.1206, 0.1374],
        [0.1438, 0.1046, 0.0845, 0.1061, 0.1370, 0.1517, 0.1371, 0.1352],
        [0.1555, 0.0971, 0.0791, 0.1084, 0.1414, 0.1406, 0.1388, 0.1390],
        [0.1578, 0.1115, 0.0847, 0.1108, 0.1425, 0.1353, 0.1291, 0.1283],
        [0.1554, 0.1019, 0.0833, 0.1025, 0.1500, 0.1434, 0.1306, 0.1329],
        [0.1766, 0.0914, 0.0786, 0.1091, 0.1357, 0.1366, 0.1279, 0.1441],
        [0.1985, 0.0840, 0.0752, 0.0973, 0.1350, 0.1470, 0.1366, 0.1264],
        [0.1607, 0.1119, 0.0864, 0.1078, 0.1420, 0.1291, 0.1292, 0.1328],
        [0.1833, 0.1008, 0.0836, 0.1004, 0.1406, 0.1383, 0.1241, 0.1289],
        [0.1692, 0.0881, 0.0752, 0.0959, 0.1463, 0.1451, 0.1428, 0.1373],
        [0.1959, 0.0812, 0.0720, 0.0866, 0.1332, 0.1406, 0.1485, 0.1421],
        [0.2085, 0.0766, 0.0691, 0.0756, 0.1352, 0.1410, 0.1455, 0.1486]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1176, 0.1366, 0.1194, 0.1161, 0.1408, 0.1378, 0.1122, 0.1195],
        [0.1290, 0.1280, 0.1107, 0.1248, 0.1362, 0.1209, 0.1286, 0.1217],
        [0.1167, 0.1339, 0.1164, 0.1312, 0.1304, 0.1184, 0.1273, 0.1259],
        [0.1264, 0.1298, 0.1143, 0.1278, 0.1288, 0.1240, 0.1256, 0.1232],
        [0.1228, 0.1119, 0.0921, 0.1182, 0.1414, 0.1378, 0.1381, 0.1377],
        [0.1193, 0.1436, 0.1261, 0.1190, 0.1200, 0.1250, 0.1264, 0.1207],
        [0.1298, 0.1288, 0.1138, 0.1185, 0.1290, 0.1266, 0.1264, 0.1271],
        [0.1321, 0.1145, 0.0958, 0.1279, 0.1384, 0.1373, 0.1231, 0.1309],
        [0.1366, 0.1116, 0.1000, 0.1278, 0.1303, 0.1360, 0.1332, 0.1246],
        [0.1159, 0.1365, 0.1206, 0.1249, 0.1216, 0.1306, 0.1210, 0.1288],
        [0.1261, 0.1237, 0.1099, 0.1217, 0.1195, 0.1375, 0.1235, 0.1381],
        [0.1331, 0.1112, 0.0934, 0.1260, 0.1385, 0.1498, 0.1151, 0.1330],
        [0.1331, 0.1070, 0.0948, 0.1236, 0.1271, 0.1459, 0.1315, 0.1370],
        [0.1330, 0.1015, 0.0906, 0.1077, 0.1323, 0.1401, 0.1484, 0.1464]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 13 [   0/390]  Loss: 0.4828 (0.483)  Acc@1: 84.3750 (84.3750)  Acc@5: 98.4375 (98.4375)LR: 2.121e-02
Train: 13 [  50/390]  Loss: 0.4291 (0.478)  Acc@1: 87.5000 (83.7316)  Acc@5: 100.0000 (99.3566)LR: 2.121e-02
Train: 13 [ 100/390]  Loss: 0.5031 (0.469)  Acc@1: 85.9375 (83.9109)  Acc@5: 98.4375 (99.2729)LR: 2.121e-02
Train: 13 [ 150/390]  Loss: 0.4750 (0.459)  Acc@1: 90.6250 (84.3233)  Acc@5: 100.0000 (99.2653)LR: 2.121e-02
Train: 13 [ 200/390]  Loss: 0.4642 (0.459)  Acc@1: 78.1250 (84.3206)  Acc@5: 100.0000 (99.2537)LR: 2.121e-02
Train: 13 [ 250/390]  Loss: 0.3289 (0.458)  Acc@1: 87.5000 (84.2318)  Acc@5: 100.0000 (99.2779)LR: 2.121e-02
Train: 13 [ 300/390]  Loss: 0.4018 (0.458)  Acc@1: 90.6250 (84.0843)  Acc@5: 100.0000 (99.3044)LR: 2.121e-02
Train: 13 [ 350/390]  Loss: 0.4691 (0.459)  Acc@1: 82.8125 (84.0901)  Acc@5: 100.0000 (99.2967)LR: 2.121e-02
Train: 13 [ 390/390]  Loss: 0.6130 (0.458)  Acc@1: 85.0000 (84.0480)  Acc@5: 97.5000 (99.3200)LR: 2.121e-02
train_acc 84.048000
Valid: 13 [   0/390]  Loss: 0.5299 (0.530)  Acc@1: 84.3750 (84.3750)  Acc@5: 98.4375 (98.4375)
Valid: 13 [  50/390]  Loss: 0.2229 (0.537)  Acc@1: 93.7500 (82.0466)  Acc@5: 100.0000 (99.0502)
Valid: 13 [ 100/390]  Loss: 0.5885 (0.542)  Acc@1: 78.1250 (82.1318)  Acc@5: 100.0000 (99.1337)
Valid: 13 [ 150/390]  Loss: 0.4464 (0.540)  Acc@1: 79.6875 (81.9847)  Acc@5: 100.0000 (99.1722)
Valid: 13 [ 200/390]  Loss: 0.4339 (0.548)  Acc@1: 87.5000 (81.9419)  Acc@5: 98.4375 (99.0827)
Valid: 13 [ 250/390]  Loss: 0.5317 (0.540)  Acc@1: 84.3750 (82.0530)  Acc@5: 96.8750 (99.0662)
Valid: 13 [ 300/390]  Loss: 0.5659 (0.541)  Acc@1: 82.8125 (81.9508)  Acc@5: 100.0000 (99.0864)
Valid: 13 [ 350/390]  Loss: 0.6140 (0.540)  Acc@1: 71.8750 (81.9311)  Acc@5: 98.4375 (99.1097)
Valid: 13 [ 390/390]  Loss: 0.8375 (0.538)  Acc@1: 70.0000 (82.0600)  Acc@5: 100.0000 (99.0760)
valid_acc 82.060000
epoch = 13   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('dil_conv_3x3', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1383, 0.1071, 0.0794, 0.1063, 0.1561, 0.1456, 0.1351, 0.1321],
        [0.1373, 0.0975, 0.0754, 0.0963, 0.1704, 0.1547, 0.1365, 0.1318],
        [0.1478, 0.1104, 0.0825, 0.1089, 0.1583, 0.1336, 0.1215, 0.1370],
        [0.1461, 0.1020, 0.0825, 0.1055, 0.1386, 0.1522, 0.1371, 0.1360],
        [0.1589, 0.0935, 0.0757, 0.1063, 0.1436, 0.1422, 0.1397, 0.1400],
        [0.1632, 0.1084, 0.0826, 0.1112, 0.1414, 0.1348, 0.1296, 0.1288],
        [0.1596, 0.0988, 0.0812, 0.1015, 0.1526, 0.1437, 0.1297, 0.1330],
        [0.1829, 0.0879, 0.0755, 0.1077, 0.1352, 0.1361, 0.1291, 0.1456],
        [0.2074, 0.0803, 0.0724, 0.0953, 0.1346, 0.1466, 0.1376, 0.1258],
        [0.1655, 0.1091, 0.0846, 0.1081, 0.1407, 0.1294, 0.1283, 0.1343],
        [0.1896, 0.0978, 0.0813, 0.0992, 0.1429, 0.1377, 0.1238, 0.1278],
        [0.1742, 0.0849, 0.0721, 0.0941, 0.1459, 0.1456, 0.1443, 0.1388],
        [0.2029, 0.0778, 0.0692, 0.0844, 0.1331, 0.1397, 0.1504, 0.1425],
        [0.2168, 0.0727, 0.0658, 0.0726, 0.1334, 0.1403, 0.1467, 0.1517]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1163, 0.1376, 0.1190, 0.1155, 0.1420, 0.1395, 0.1110, 0.1191],
        [0.1298, 0.1261, 0.1088, 0.1259, 0.1360, 0.1211, 0.1299, 0.1224],
        [0.1161, 0.1352, 0.1163, 0.1316, 0.1304, 0.1169, 0.1279, 0.1256],
        [0.1268, 0.1280, 0.1125, 0.1290, 0.1293, 0.1245, 0.1266, 0.1234],
        [0.1223, 0.1103, 0.0904, 0.1183, 0.1428, 0.1365, 0.1386, 0.1408],
        [0.1186, 0.1448, 0.1261, 0.1188, 0.1198, 0.1242, 0.1270, 0.1207],
        [0.1297, 0.1286, 0.1137, 0.1184, 0.1298, 0.1278, 0.1260, 0.1261],
        [0.1330, 0.1134, 0.0948, 0.1298, 0.1381, 0.1363, 0.1236, 0.1309],
        [0.1376, 0.1101, 0.0986, 0.1291, 0.1300, 0.1356, 0.1339, 0.1251],
        [0.1152, 0.1383, 0.1207, 0.1249, 0.1220, 0.1310, 0.1196, 0.1282],
        [0.1264, 0.1231, 0.1092, 0.1216, 0.1186, 0.1387, 0.1235, 0.1389],
        [0.1323, 0.1105, 0.0927, 0.1271, 0.1389, 0.1497, 0.1139, 0.1348],
        [0.1331, 0.1060, 0.0937, 0.1249, 0.1285, 0.1451, 0.1314, 0.1375],
        [0.1344, 0.0997, 0.0891, 0.1077, 0.1315, 0.1404, 0.1499, 0.1473]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 14 [   0/390]  Loss: 0.6630 (0.663)  Acc@1: 75.0000 (75.0000)  Acc@5: 98.4375 (98.4375)LR: 2.065e-02
Train: 14 [  50/390]  Loss: 0.3294 (0.432)  Acc@1: 89.0625 (84.7120)  Acc@5: 100.0000 (99.4179)LR: 2.065e-02
Train: 14 [ 100/390]  Loss: 0.4297 (0.431)  Acc@1: 87.5000 (84.5916)  Acc@5: 100.0000 (99.4121)LR: 2.065e-02
Train: 14 [ 150/390]  Loss: 0.5984 (0.439)  Acc@1: 79.6875 (84.7268)  Acc@5: 98.4375 (99.3998)LR: 2.065e-02
Train: 14 [ 200/390]  Loss: 0.3943 (0.435)  Acc@1: 84.3750 (84.7248)  Acc@5: 100.0000 (99.4170)LR: 2.065e-02
Train: 14 [ 250/390]  Loss: 0.5325 (0.439)  Acc@1: 81.2500 (84.6365)  Acc@5: 98.4375 (99.3837)LR: 2.065e-02
Train: 14 [ 300/390]  Loss: 0.4119 (0.437)  Acc@1: 87.5000 (84.8370)  Acc@5: 100.0000 (99.4030)LR: 2.065e-02
Train: 14 [ 350/390]  Loss: 0.2664 (0.443)  Acc@1: 92.1875 (84.5976)  Acc@5: 100.0000 (99.3946)LR: 2.065e-02
Train: 14 [ 390/390]  Loss: 0.3195 (0.442)  Acc@1: 90.0000 (84.6760)  Acc@5: 100.0000 (99.4120)LR: 2.065e-02
train_acc 84.676000
Valid: 14 [   0/390]  Loss: 0.5790 (0.579)  Acc@1: 82.8125 (82.8125)  Acc@5: 98.4375 (98.4375)
Valid: 14 [  50/390]  Loss: 0.5724 (0.517)  Acc@1: 81.2500 (82.7819)  Acc@5: 96.8750 (99.1115)
Valid: 14 [ 100/390]  Loss: 0.4257 (0.500)  Acc@1: 84.3750 (83.1219)  Acc@5: 100.0000 (99.1801)
Valid: 14 [ 150/390]  Loss: 0.3093 (0.497)  Acc@1: 90.6250 (83.0401)  Acc@5: 100.0000 (99.1722)
Valid: 14 [ 200/390]  Loss: 0.7892 (0.502)  Acc@1: 76.5625 (82.8280)  Acc@5: 98.4375 (99.1838)
Valid: 14 [ 250/390]  Loss: 0.3666 (0.502)  Acc@1: 87.5000 (82.8187)  Acc@5: 100.0000 (99.2156)
Valid: 14 [ 300/390]  Loss: 0.4721 (0.503)  Acc@1: 81.2500 (82.7606)  Acc@5: 100.0000 (99.2421)
Valid: 14 [ 350/390]  Loss: 0.5332 (0.499)  Acc@1: 79.6875 (82.8348)  Acc@5: 98.4375 (99.2477)
Valid: 14 [ 390/390]  Loss: 0.5276 (0.500)  Acc@1: 77.5000 (82.8360)  Acc@5: 100.0000 (99.2160)
valid_acc 82.836000
epoch = 14   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 2), ('dil_conv_3x3', 4)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1403, 0.1044, 0.0768, 0.1051, 0.1588, 0.1464, 0.1356, 0.1325],
        [0.1376, 0.0955, 0.0731, 0.0953, 0.1740, 0.1553, 0.1363, 0.1328],
        [0.1514, 0.1082, 0.0805, 0.1089, 0.1605, 0.1325, 0.1204, 0.1375],
        [0.1466, 0.1001, 0.0806, 0.1051, 0.1394, 0.1540, 0.1388, 0.1353],
        [0.1617, 0.0904, 0.0728, 0.1053, 0.1460, 0.1433, 0.1401, 0.1404],
        [0.1690, 0.1061, 0.0804, 0.1113, 0.1402, 0.1338, 0.1298, 0.1293],
        [0.1629, 0.0963, 0.0791, 0.1009, 0.1541, 0.1436, 0.1299, 0.1331],
        [0.1894, 0.0850, 0.0729, 0.1070, 0.1351, 0.1363, 0.1283, 0.1460],
        [0.2177, 0.0771, 0.0695, 0.0937, 0.1318, 0.1476, 0.1381, 0.1244],
        [0.1725, 0.1065, 0.0827, 0.1085, 0.1401, 0.1284, 0.1260, 0.1353],
        [0.1977, 0.0950, 0.0788, 0.0978, 0.1422, 0.1368, 0.1245, 0.1272],
        [0.1804, 0.0815, 0.0691, 0.0926, 0.1472, 0.1438, 0.1458, 0.1396],
        [0.2114, 0.0744, 0.0662, 0.0823, 0.1325, 0.1394, 0.1511, 0.1427],
        [0.2252, 0.0689, 0.0622, 0.0693, 0.1333, 0.1389, 0.1484, 0.1538]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1162, 0.1369, 0.1172, 0.1160, 0.1441, 0.1398, 0.1103, 0.1195],
        [0.1290, 0.1264, 0.1083, 0.1269, 0.1365, 0.1202, 0.1311, 0.1216],
        [0.1158, 0.1349, 0.1149, 0.1320, 0.1319, 0.1165, 0.1292, 0.1249],
        [0.1257, 0.1281, 0.1122, 0.1291, 0.1299, 0.1236, 0.1279, 0.1234],
        [0.1224, 0.1099, 0.0888, 0.1192, 0.1427, 0.1364, 0.1392, 0.1413],
        [0.1193, 0.1443, 0.1251, 0.1188, 0.1204, 0.1243, 0.1280, 0.1199],
        [0.1297, 0.1297, 0.1143, 0.1180, 0.1302, 0.1271, 0.1264, 0.1247],
        [0.1330, 0.1124, 0.0930, 0.1310, 0.1391, 0.1359, 0.1239, 0.1317],
        [0.1401, 0.1094, 0.0974, 0.1310, 0.1284, 0.1346, 0.1331, 0.1260],
        [0.1155, 0.1380, 0.1193, 0.1254, 0.1218, 0.1329, 0.1184, 0.1287],
        [0.1251, 0.1237, 0.1092, 0.1213, 0.1191, 0.1396, 0.1237, 0.1383],
        [0.1317, 0.1096, 0.0905, 0.1269, 0.1399, 0.1513, 0.1135, 0.1366],
        [0.1321, 0.1049, 0.0919, 0.1252, 0.1300, 0.1456, 0.1320, 0.1383],
        [0.1365, 0.0976, 0.0865, 0.1064, 0.1324, 0.1405, 0.1504, 0.1497]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 15 [   0/390]  Loss: 0.4242 (0.424)  Acc@1: 84.3750 (84.3750)  Acc@5: 100.0000 (100.0000)LR: 2.005e-02
Train: 15 [  50/390]  Loss: 0.2645 (0.428)  Acc@1: 89.0625 (84.7733)  Acc@5: 100.0000 (99.4792)LR: 2.005e-02
Train: 15 [ 100/390]  Loss: 0.3003 (0.429)  Acc@1: 92.1875 (84.9474)  Acc@5: 100.0000 (99.4895)LR: 2.005e-02
Train: 15 [ 150/390]  Loss: 0.5899 (0.430)  Acc@1: 78.1250 (84.9027)  Acc@5: 98.4375 (99.4930)LR: 2.005e-02
Train: 15 [ 200/390]  Loss: 0.2658 (0.433)  Acc@1: 92.1875 (84.8103)  Acc@5: 100.0000 (99.4947)LR: 2.005e-02
Train: 15 [ 250/390]  Loss: 0.3930 (0.439)  Acc@1: 84.3750 (84.5306)  Acc@5: 100.0000 (99.4522)LR: 2.005e-02
Train: 15 [ 300/390]  Loss: 0.3996 (0.435)  Acc@1: 84.3750 (84.7280)  Acc@5: 98.4375 (99.4186)LR: 2.005e-02
Train: 15 [ 350/390]  Loss: 0.3558 (0.436)  Acc@1: 85.9375 (84.6777)  Acc@5: 100.0000 (99.4391)LR: 2.005e-02
Train: 15 [ 390/390]  Loss: 0.3305 (0.435)  Acc@1: 90.0000 (84.7240)  Acc@5: 100.0000 (99.4280)LR: 2.005e-02
train_acc 84.724000
Valid: 15 [   0/390]  Loss: 0.4163 (0.416)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)
Valid: 15 [  50/390]  Loss: 0.6174 (0.483)  Acc@1: 79.6875 (82.9963)  Acc@5: 98.4375 (98.9583)
Valid: 15 [ 100/390]  Loss: 0.6278 (0.478)  Acc@1: 76.5625 (83.5241)  Acc@5: 100.0000 (99.2110)
Valid: 15 [ 150/390]  Loss: 0.4649 (0.467)  Acc@1: 85.9375 (84.0956)  Acc@5: 100.0000 (99.2239)
Valid: 15 [ 200/390]  Loss: 0.5938 (0.468)  Acc@1: 84.3750 (84.1340)  Acc@5: 100.0000 (99.2460)
Valid: 15 [ 250/390]  Loss: 0.4980 (0.471)  Acc@1: 79.6875 (84.0202)  Acc@5: 98.4375 (99.2219)
Valid: 15 [ 300/390]  Loss: 0.4311 (0.472)  Acc@1: 78.1250 (83.9026)  Acc@5: 100.0000 (99.2265)
Valid: 15 [ 350/390]  Loss: 0.6788 (0.471)  Acc@1: 81.2500 (83.9343)  Acc@5: 98.4375 (99.2254)
Valid: 15 [ 390/390]  Loss: 0.5109 (0.475)  Acc@1: 82.5000 (83.7920)  Acc@5: 100.0000 (99.2160)
valid_acc 83.792000
epoch = 15   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 2), ('dil_conv_3x3', 4)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1416, 0.1023, 0.0750, 0.1048, 0.1616, 0.1453, 0.1366, 0.1328],
        [0.1388, 0.0933, 0.0709, 0.0941, 0.1774, 0.1565, 0.1348, 0.1342],
        [0.1544, 0.1061, 0.0786, 0.1085, 0.1612, 0.1338, 0.1199, 0.1374],
        [0.1479, 0.0980, 0.0786, 0.1046, 0.1397, 0.1571, 0.1387, 0.1353],
        [0.1655, 0.0873, 0.0705, 0.1043, 0.1464, 0.1447, 0.1409, 0.1405],
        [0.1728, 0.1045, 0.0788, 0.1115, 0.1403, 0.1324, 0.1305, 0.1293],
        [0.1675, 0.0937, 0.0769, 0.0998, 0.1575, 0.1425, 0.1301, 0.1322],
        [0.1958, 0.0818, 0.0702, 0.1057, 0.1364, 0.1352, 0.1288, 0.1460],
        [0.2262, 0.0747, 0.0672, 0.0922, 0.1316, 0.1475, 0.1380, 0.1226],
        [0.1789, 0.1038, 0.0805, 0.1078, 0.1400, 0.1276, 0.1250, 0.1364],
        [0.2073, 0.0921, 0.0762, 0.0965, 0.1421, 0.1353, 0.1238, 0.1269],
        [0.1888, 0.0781, 0.0664, 0.0914, 0.1478, 0.1415, 0.1465, 0.1395],
        [0.2206, 0.0712, 0.0632, 0.0801, 0.1318, 0.1389, 0.1515, 0.1426],
        [0.2357, 0.0654, 0.0592, 0.0663, 0.1338, 0.1380, 0.1490, 0.1526]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1159, 0.1375, 0.1174, 0.1160, 0.1431, 0.1407, 0.1108, 0.1187],
        [0.1286, 0.1259, 0.1078, 0.1276, 0.1363, 0.1204, 0.1311, 0.1222],
        [0.1154, 0.1360, 0.1157, 0.1322, 0.1313, 0.1172, 0.1271, 0.1251],
        [0.1252, 0.1276, 0.1123, 0.1306, 0.1294, 0.1229, 0.1281, 0.1240],
        [0.1237, 0.1087, 0.0884, 0.1205, 0.1417, 0.1367, 0.1383, 0.1420],
        [0.1185, 0.1454, 0.1259, 0.1187, 0.1202, 0.1245, 0.1273, 0.1195],
        [0.1295, 0.1300, 0.1145, 0.1181, 0.1311, 0.1269, 0.1256, 0.1242],
        [0.1339, 0.1103, 0.0924, 0.1325, 0.1391, 0.1364, 0.1236, 0.1318],
        [0.1420, 0.1079, 0.0966, 0.1325, 0.1275, 0.1337, 0.1313, 0.1284],
        [0.1141, 0.1383, 0.1191, 0.1265, 0.1226, 0.1348, 0.1168, 0.1280],
        [0.1249, 0.1236, 0.1094, 0.1221, 0.1195, 0.1410, 0.1230, 0.1365],
        [0.1307, 0.1081, 0.0900, 0.1277, 0.1404, 0.1534, 0.1125, 0.1372],
        [0.1335, 0.1032, 0.0910, 0.1258, 0.1292, 0.1461, 0.1317, 0.1396],
        [0.1368, 0.0964, 0.0852, 0.1060, 0.1335, 0.1394, 0.1517, 0.1510]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 16 [   0/390]  Loss: 0.3400 (0.340)  Acc@1: 92.1875 (92.1875)  Acc@5: 98.4375 (98.4375)LR: 1.943e-02
Train: 16 [  50/390]  Loss: 0.3080 (0.387)  Acc@1: 89.0625 (87.0098)  Acc@5: 98.4375 (99.6936)LR: 1.943e-02
Train: 16 [ 100/390]  Loss: 0.5228 (0.391)  Acc@1: 79.6875 (86.7110)  Acc@5: 100.0000 (99.5359)LR: 1.943e-02
Train: 16 [ 150/390]  Loss: 0.1947 (0.392)  Acc@1: 93.7500 (86.5480)  Acc@5: 100.0000 (99.5344)LR: 1.943e-02
Train: 16 [ 200/390]  Loss: 0.2807 (0.398)  Acc@1: 87.5000 (86.3029)  Acc@5: 98.4375 (99.4947)LR: 1.943e-02
Train: 16 [ 250/390]  Loss: 0.5017 (0.399)  Acc@1: 87.5000 (86.2861)  Acc@5: 100.0000 (99.4709)LR: 1.943e-02
Train: 16 [ 300/390]  Loss: 0.3217 (0.406)  Acc@1: 87.5000 (86.0361)  Acc@5: 98.4375 (99.4394)LR: 1.943e-02
Train: 16 [ 350/390]  Loss: 0.3435 (0.404)  Acc@1: 93.7500 (85.9865)  Acc@5: 100.0000 (99.4658)LR: 1.943e-02
Train: 16 [ 390/390]  Loss: 0.1935 (0.409)  Acc@1: 95.0000 (85.8280)  Acc@5: 100.0000 (99.4560)LR: 1.943e-02
train_acc 85.828000
Valid: 16 [   0/390]  Loss: 0.4228 (0.423)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)
Valid: 16 [  50/390]  Loss: 0.5836 (0.487)  Acc@1: 79.6875 (84.2525)  Acc@5: 100.0000 (99.0196)
Valid: 16 [ 100/390]  Loss: 0.6408 (0.495)  Acc@1: 84.3750 (83.7562)  Acc@5: 98.4375 (99.0718)
Valid: 16 [ 150/390]  Loss: 0.4586 (0.504)  Acc@1: 82.8125 (83.4023)  Acc@5: 100.0000 (99.1204)
Valid: 16 [ 200/390]  Loss: 0.4959 (0.499)  Acc@1: 81.2500 (83.4499)  Acc@5: 100.0000 (99.1838)
Valid: 16 [ 250/390]  Loss: 0.6492 (0.507)  Acc@1: 81.2500 (83.1985)  Acc@5: 98.4375 (99.1534)
Valid: 16 [ 300/390]  Loss: 0.5046 (0.514)  Acc@1: 84.3750 (83.1136)  Acc@5: 100.0000 (99.0968)
Valid: 16 [ 350/390]  Loss: 0.5055 (0.513)  Acc@1: 85.9375 (83.1419)  Acc@5: 98.4375 (99.1453)
Valid: 16 [ 390/390]  Loss: 0.3639 (0.509)  Acc@1: 90.0000 (83.2360)  Acc@5: 100.0000 (99.1600)
valid_acc 83.236000
epoch = 16   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1435, 0.0998, 0.0736, 0.1054, 0.1642, 0.1437, 0.1376, 0.1323],
        [0.1396, 0.0907, 0.0687, 0.0930, 0.1818, 0.1580, 0.1334, 0.1350],
        [0.1581, 0.1033, 0.0770, 0.1089, 0.1629, 0.1336, 0.1189, 0.1374],
        [0.1489, 0.0959, 0.0769, 0.1046, 0.1410, 0.1588, 0.1398, 0.1339],
        [0.1693, 0.0839, 0.0678, 0.1029, 0.1469, 0.1479, 0.1405, 0.1408],
        [0.1768, 0.1022, 0.0779, 0.1125, 0.1393, 0.1318, 0.1310, 0.1284],
        [0.1711, 0.0917, 0.0754, 0.0998, 0.1590, 0.1432, 0.1292, 0.1307],
        [0.2012, 0.0786, 0.0678, 0.1043, 0.1388, 0.1351, 0.1297, 0.1444],
        [0.2358, 0.0721, 0.0648, 0.0904, 0.1297, 0.1477, 0.1386, 0.1209],
        [0.1843, 0.1009, 0.0791, 0.1084, 0.1385, 0.1272, 0.1247, 0.1368],
        [0.2160, 0.0889, 0.0737, 0.0950, 0.1437, 0.1337, 0.1230, 0.1260],
        [0.1969, 0.0746, 0.0635, 0.0896, 0.1486, 0.1407, 0.1477, 0.1382],
        [0.2290, 0.0681, 0.0606, 0.0779, 0.1321, 0.1380, 0.1512, 0.1432],
        [0.2473, 0.0621, 0.0563, 0.0637, 0.1342, 0.1359, 0.1485, 0.1521]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1158, 0.1373, 0.1171, 0.1160, 0.1433, 0.1414, 0.1118, 0.1174],
        [0.1278, 0.1244, 0.1059, 0.1281, 0.1383, 0.1204, 0.1328, 0.1224],
        [0.1146, 0.1361, 0.1158, 0.1327, 0.1327, 0.1160, 0.1279, 0.1243],
        [0.1259, 0.1268, 0.1114, 0.1308, 0.1289, 0.1232, 0.1289, 0.1241],
        [0.1232, 0.1066, 0.0876, 0.1204, 0.1430, 0.1374, 0.1390, 0.1426],
        [0.1175, 0.1457, 0.1270, 0.1187, 0.1199, 0.1230, 0.1288, 0.1195],
        [0.1301, 0.1299, 0.1143, 0.1186, 0.1319, 0.1252, 0.1258, 0.1241],
        [0.1335, 0.1078, 0.0921, 0.1329, 0.1389, 0.1375, 0.1250, 0.1323],
        [0.1438, 0.1061, 0.0961, 0.1341, 0.1283, 0.1328, 0.1307, 0.1282],
        [0.1129, 0.1374, 0.1190, 0.1279, 0.1234, 0.1366, 0.1155, 0.1274],
        [0.1257, 0.1218, 0.1074, 0.1219, 0.1201, 0.1406, 0.1249, 0.1376],
        [0.1313, 0.1058, 0.0891, 0.1280, 0.1420, 0.1534, 0.1126, 0.1378],
        [0.1334, 0.1010, 0.0895, 0.1252, 0.1299, 0.1471, 0.1334, 0.1405],
        [0.1384, 0.0938, 0.0832, 0.1042, 0.1348, 0.1390, 0.1530, 0.1536]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 17 [   0/390]  Loss: 0.3356 (0.336)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)LR: 1.878e-02
Train: 17 [  50/390]  Loss: 0.3907 (0.395)  Acc@1: 82.8125 (85.7230)  Acc@5: 100.0000 (99.2647)LR: 1.878e-02
Train: 17 [ 100/390]  Loss: 0.5198 (0.389)  Acc@1: 81.2500 (86.0922)  Acc@5: 100.0000 (99.4740)LR: 1.878e-02
Train: 17 [ 150/390]  Loss: 0.4263 (0.387)  Acc@1: 81.2500 (86.2686)  Acc@5: 100.0000 (99.4930)LR: 1.878e-02
Train: 17 [ 200/390]  Loss: 0.5124 (0.387)  Acc@1: 75.0000 (86.2251)  Acc@5: 100.0000 (99.5414)LR: 1.878e-02
Train: 17 [ 250/390]  Loss: 0.3491 (0.392)  Acc@1: 87.5000 (86.1305)  Acc@5: 100.0000 (99.5705)LR: 1.878e-02
Train: 17 [ 300/390]  Loss: 0.4066 (0.394)  Acc@1: 82.8125 (86.1296)  Acc@5: 100.0000 (99.5328)LR: 1.878e-02
Train: 17 [ 350/390]  Loss: 0.5250 (0.399)  Acc@1: 79.6875 (86.0755)  Acc@5: 98.4375 (99.5192)LR: 1.878e-02
Train: 17 [ 390/390]  Loss: 0.3717 (0.396)  Acc@1: 87.5000 (86.2040)  Acc@5: 100.0000 (99.5240)LR: 1.878e-02
train_acc 86.204000
Valid: 17 [   0/390]  Loss: 0.5909 (0.591)  Acc@1: 81.2500 (81.2500)  Acc@5: 96.8750 (96.8750)
Valid: 17 [  50/390]  Loss: 0.3989 (0.469)  Acc@1: 85.9375 (84.4363)  Acc@5: 100.0000 (99.2953)
Valid: 17 [ 100/390]  Loss: 0.4182 (0.475)  Acc@1: 84.3750 (84.1120)  Acc@5: 100.0000 (99.3348)
Valid: 17 [ 150/390]  Loss: 0.2738 (0.481)  Acc@1: 90.6250 (84.1370)  Acc@5: 100.0000 (99.2653)
Valid: 17 [ 200/390]  Loss: 0.5174 (0.479)  Acc@1: 81.2500 (84.2118)  Acc@5: 100.0000 (99.2693)
Valid: 17 [ 250/390]  Loss: 0.4933 (0.474)  Acc@1: 85.9375 (84.3625)  Acc@5: 98.4375 (99.2654)
Valid: 17 [ 300/390]  Loss: 0.3209 (0.472)  Acc@1: 90.6250 (84.3594)  Acc@5: 100.0000 (99.2577)
Valid: 17 [ 350/390]  Loss: 0.4063 (0.473)  Acc@1: 85.9375 (84.3305)  Acc@5: 98.4375 (99.2254)
Valid: 17 [ 390/390]  Loss: 1.073 (0.476)  Acc@1: 67.5000 (84.2320)  Acc@5: 95.0000 (99.2120)
valid_acc 84.232000
epoch = 17   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_3x3', 3), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1445, 0.0974, 0.0721, 0.1054, 0.1683, 0.1428, 0.1383, 0.1311],
        [0.1407, 0.0883, 0.0673, 0.0922, 0.1816, 0.1588, 0.1345, 0.1367],
        [0.1598, 0.1018, 0.0760, 0.1096, 0.1639, 0.1329, 0.1178, 0.1381],
        [0.1517, 0.0941, 0.0758, 0.1049, 0.1427, 0.1594, 0.1377, 0.1336],
        [0.1714, 0.0813, 0.0658, 0.1017, 0.1471, 0.1486, 0.1422, 0.1418],
        [0.1810, 0.1005, 0.0765, 0.1122, 0.1385, 0.1313, 0.1315, 0.1285],
        [0.1749, 0.0907, 0.0749, 0.1006, 0.1597, 0.1415, 0.1280, 0.1297],
        [0.2068, 0.0762, 0.0657, 0.1033, 0.1412, 0.1342, 0.1292, 0.1434],
        [0.2451, 0.0699, 0.0629, 0.0893, 0.1290, 0.1462, 0.1375, 0.1201],
        [0.1898, 0.0987, 0.0778, 0.1087, 0.1392, 0.1265, 0.1226, 0.1367],
        [0.2233, 0.0867, 0.0726, 0.0948, 0.1449, 0.1321, 0.1218, 0.1238],
        [0.2047, 0.0718, 0.0614, 0.0886, 0.1502, 0.1385, 0.1469, 0.1380],
        [0.2376, 0.0659, 0.0589, 0.0767, 0.1304, 0.1353, 0.1525, 0.1427],
        [0.2572, 0.0595, 0.0542, 0.0616, 0.1330, 0.1339, 0.1485, 0.1522]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1157, 0.1376, 0.1166, 0.1150, 0.1442, 0.1417, 0.1123, 0.1169],
        [0.1271, 0.1238, 0.1050, 0.1278, 0.1393, 0.1207, 0.1334, 0.1228],
        [0.1143, 0.1359, 0.1148, 0.1332, 0.1325, 0.1163, 0.1283, 0.1246],
        [0.1262, 0.1266, 0.1111, 0.1310, 0.1295, 0.1239, 0.1292, 0.1225],
        [0.1219, 0.1054, 0.0871, 0.1199, 0.1442, 0.1383, 0.1392, 0.1441],
        [0.1180, 0.1464, 0.1270, 0.1185, 0.1191, 0.1215, 0.1306, 0.1189],
        [0.1303, 0.1298, 0.1142, 0.1182, 0.1329, 0.1247, 0.1260, 0.1239],
        [0.1343, 0.1061, 0.0917, 0.1334, 0.1398, 0.1385, 0.1247, 0.1316],
        [0.1438, 0.1043, 0.0956, 0.1345, 0.1278, 0.1347, 0.1304, 0.1289],
        [0.1129, 0.1372, 0.1188, 0.1295, 0.1234, 0.1375, 0.1142, 0.1266],
        [0.1254, 0.1218, 0.1071, 0.1215, 0.1210, 0.1403, 0.1265, 0.1365],
        [0.1312, 0.1043, 0.0881, 0.1275, 0.1418, 0.1544, 0.1133, 0.1394],
        [0.1330, 0.0987, 0.0881, 0.1244, 0.1324, 0.1475, 0.1337, 0.1423],
        [0.1408, 0.0922, 0.0820, 0.1035, 0.1341, 0.1391, 0.1525, 0.1557]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 18 [   0/390]  Loss: 0.2439 (0.244)  Acc@1: 93.7500 (93.7500)  Acc@5: 100.0000 (100.0000)LR: 1.811e-02
Train: 18 [  50/390]  Loss: 0.3908 (0.385)  Acc@1: 82.8125 (86.7647)  Acc@5: 100.0000 (99.6017)LR: 1.811e-02
Train: 18 [ 100/390]  Loss: 0.3024 (0.381)  Acc@1: 90.6250 (86.8812)  Acc@5: 100.0000 (99.5204)LR: 1.811e-02
Train: 18 [ 150/390]  Loss: 0.3702 (0.388)  Acc@1: 89.0625 (86.5894)  Acc@5: 98.4375 (99.4930)LR: 1.811e-02
Train: 18 [ 200/390]  Loss: 0.2286 (0.397)  Acc@1: 92.1875 (86.4195)  Acc@5: 100.0000 (99.4636)LR: 1.811e-02
Train: 18 [ 250/390]  Loss: 0.1460 (0.388)  Acc@1: 95.3125 (86.6347)  Acc@5: 100.0000 (99.5144)LR: 1.811e-02
Train: 18 [ 300/390]  Loss: 0.4058 (0.394)  Acc@1: 85.9375 (86.3995)  Acc@5: 100.0000 (99.4861)LR: 1.811e-02
Train: 18 [ 350/390]  Loss: 0.4604 (0.392)  Acc@1: 82.8125 (86.4272)  Acc@5: 100.0000 (99.4881)LR: 1.811e-02
Train: 18 [ 390/390]  Loss: 0.5104 (0.395)  Acc@1: 80.0000 (86.3200)  Acc@5: 100.0000 (99.4680)LR: 1.811e-02
train_acc 86.320000
Valid: 18 [   0/390]  Loss: 0.4493 (0.449)  Acc@1: 87.5000 (87.5000)  Acc@5: 100.0000 (100.0000)
Valid: 18 [  50/390]  Loss: 0.5405 (0.491)  Acc@1: 79.6875 (83.5784)  Acc@5: 100.0000 (99.3260)
Valid: 18 [ 100/390]  Loss: 0.6256 (0.498)  Acc@1: 78.1250 (83.2766)  Acc@5: 98.4375 (99.3348)
Valid: 18 [ 150/390]  Loss: 0.4526 (0.495)  Acc@1: 89.0625 (83.4644)  Acc@5: 98.4375 (99.3791)
Valid: 18 [ 200/390]  Loss: 0.4830 (0.496)  Acc@1: 87.5000 (83.5121)  Acc@5: 100.0000 (99.3392)
Valid: 18 [ 250/390]  Loss: 0.3883 (0.485)  Acc@1: 87.5000 (83.8708)  Acc@5: 98.4375 (99.3028)
Valid: 18 [ 300/390]  Loss: 0.5520 (0.485)  Acc@1: 79.6875 (83.8403)  Acc@5: 98.4375 (99.2992)
Valid: 18 [ 350/390]  Loss: 0.5643 (0.484)  Acc@1: 82.8125 (83.7740)  Acc@5: 98.4375 (99.3234)
Valid: 18 [ 390/390]  Loss: 0.6372 (0.480)  Acc@1: 82.5000 (83.8960)  Acc@5: 97.5000 (99.3160)
valid_acc 83.896000
epoch = 18   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_3x3', 3), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_5x5', 2), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1463, 0.0960, 0.0704, 0.1056, 0.1703, 0.1418, 0.1396, 0.1301],
        [0.1424, 0.0868, 0.0655, 0.0916, 0.1830, 0.1612, 0.1327, 0.1367],
        [0.1610, 0.1004, 0.0744, 0.1097, 0.1673, 0.1327, 0.1178, 0.1367],
        [0.1546, 0.0931, 0.0745, 0.1054, 0.1427, 0.1577, 0.1371, 0.1349],
        [0.1753, 0.0782, 0.0636, 0.1002, 0.1468, 0.1511, 0.1429, 0.1419],
        [0.1849, 0.0990, 0.0750, 0.1124, 0.1382, 0.1299, 0.1326, 0.1280],
        [0.1801, 0.0892, 0.0737, 0.1012, 0.1588, 0.1404, 0.1267, 0.1298],
        [0.2152, 0.0732, 0.0635, 0.1022, 0.1404, 0.1330, 0.1296, 0.1429],
        [0.2562, 0.0678, 0.0614, 0.0887, 0.1275, 0.1449, 0.1350, 0.1185],
        [0.1959, 0.0970, 0.0763, 0.1088, 0.1386, 0.1250, 0.1217, 0.1368],
        [0.2330, 0.0850, 0.0708, 0.0944, 0.1436, 0.1310, 0.1192, 0.1230],
        [0.2141, 0.0688, 0.0591, 0.0875, 0.1484, 0.1370, 0.1473, 0.1377],
        [0.2477, 0.0633, 0.0569, 0.0754, 0.1287, 0.1336, 0.1529, 0.1415],
        [0.2683, 0.0568, 0.0517, 0.0593, 0.1321, 0.1330, 0.1484, 0.1505]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1163, 0.1380, 0.1159, 0.1148, 0.1442, 0.1419, 0.1120, 0.1169],
        [0.1257, 0.1234, 0.1039, 0.1278, 0.1413, 0.1198, 0.1350, 0.1232],
        [0.1140, 0.1362, 0.1149, 0.1336, 0.1323, 0.1162, 0.1290, 0.1239],
        [0.1246, 0.1263, 0.1104, 0.1314, 0.1307, 0.1237, 0.1309, 0.1219],
        [0.1230, 0.1050, 0.0867, 0.1217, 0.1444, 0.1360, 0.1388, 0.1444],
        [0.1178, 0.1471, 0.1271, 0.1184, 0.1196, 0.1215, 0.1310, 0.1176],
        [0.1301, 0.1302, 0.1142, 0.1170, 0.1338, 0.1244, 0.1271, 0.1232],
        [0.1336, 0.1056, 0.0913, 0.1354, 0.1395, 0.1397, 0.1241, 0.1308],
        [0.1445, 0.1022, 0.0943, 0.1351, 0.1288, 0.1364, 0.1298, 0.1291],
        [0.1129, 0.1373, 0.1180, 0.1299, 0.1249, 0.1379, 0.1139, 0.1253],
        [0.1242, 0.1219, 0.1069, 0.1208, 0.1201, 0.1403, 0.1287, 0.1370],
        [0.1310, 0.1034, 0.0874, 0.1288, 0.1409, 0.1536, 0.1130, 0.1419],
        [0.1352, 0.0970, 0.0871, 0.1254, 0.1331, 0.1477, 0.1319, 0.1426],
        [0.1423, 0.0907, 0.0806, 0.1030, 0.1344, 0.1387, 0.1537, 0.1567]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 19 [   0/390]  Loss: 0.3194 (0.319)  Acc@1: 90.6250 (90.6250)  Acc@5: 100.0000 (100.0000)LR: 1.742e-02
Train: 19 [  50/390]  Loss: 0.3755 (0.352)  Acc@1: 84.3750 (88.2966)  Acc@5: 100.0000 (99.6324)LR: 1.742e-02
Train: 19 [ 100/390]  Loss: 0.4440 (0.360)  Acc@1: 81.2500 (87.5774)  Acc@5: 98.4375 (99.5514)LR: 1.742e-02
Train: 19 [ 150/390]  Loss: 0.4114 (0.368)  Acc@1: 85.9375 (87.3137)  Acc@5: 98.4375 (99.5447)LR: 1.742e-02
Train: 19 [ 200/390]  Loss: 0.5322 (0.377)  Acc@1: 79.6875 (86.8937)  Acc@5: 98.4375 (99.5103)LR: 1.742e-02
Train: 19 [ 250/390]  Loss: 0.4013 (0.377)  Acc@1: 82.8125 (86.8650)  Acc@5: 100.0000 (99.5393)LR: 1.742e-02
Train: 19 [ 300/390]  Loss: 0.4924 (0.380)  Acc@1: 84.3750 (86.7629)  Acc@5: 98.4375 (99.5536)LR: 1.742e-02
Train: 19 [ 350/390]  Loss: 0.4081 (0.379)  Acc@1: 85.9375 (86.7388)  Acc@5: 100.0000 (99.5548)LR: 1.742e-02
Train: 19 [ 390/390]  Loss: 0.4520 (0.378)  Acc@1: 80.0000 (86.8080)  Acc@5: 100.0000 (99.5720)LR: 1.742e-02
train_acc 86.808000
Valid: 19 [   0/390]  Loss: 0.1271 (0.127)  Acc@1: 93.7500 (93.7500)  Acc@5: 100.0000 (100.0000)
Valid: 19 [  50/390]  Loss: 0.3545 (0.449)  Acc@1: 89.0625 (84.9571)  Acc@5: 98.4375 (99.3873)
Valid: 19 [ 100/390]  Loss: 0.6850 (0.451)  Acc@1: 82.8125 (85.1330)  Acc@5: 95.3125 (99.3038)
Valid: 19 [ 150/390]  Loss: 0.4998 (0.456)  Acc@1: 82.8125 (84.8096)  Acc@5: 98.4375 (99.3377)
Valid: 19 [ 200/390]  Loss: 0.4315 (0.458)  Acc@1: 84.3750 (84.7559)  Acc@5: 98.4375 (99.2926)
Valid: 19 [ 250/390]  Loss: 0.4219 (0.458)  Acc@1: 84.3750 (84.8294)  Acc@5: 98.4375 (99.2903)
Valid: 19 [ 300/390]  Loss: 0.3410 (0.460)  Acc@1: 87.5000 (84.7332)  Acc@5: 100.0000 (99.2888)
Valid: 19 [ 350/390]  Loss: 0.5592 (0.468)  Acc@1: 76.5625 (84.6287)  Acc@5: 100.0000 (99.2432)
Valid: 19 [ 390/390]  Loss: 0.6877 (0.467)  Acc@1: 82.5000 (84.6520)  Acc@5: 100.0000 (99.2560)
valid_acc 84.652000
epoch = 19   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_3x3', 3), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1472, 0.0935, 0.0681, 0.1043, 0.1726, 0.1422, 0.1409, 0.1311],
        [0.1443, 0.0839, 0.0631, 0.0896, 0.1875, 0.1614, 0.1324, 0.1379],
        [0.1635, 0.0988, 0.0724, 0.1094, 0.1700, 0.1318, 0.1169, 0.1372],
        [0.1588, 0.0909, 0.0723, 0.1039, 0.1431, 0.1578, 0.1376, 0.1354],
        [0.1786, 0.0760, 0.0614, 0.0989, 0.1476, 0.1532, 0.1432, 0.1410],
        [0.1890, 0.0973, 0.0735, 0.1126, 0.1373, 0.1306, 0.1316, 0.1279],
        [0.1859, 0.0873, 0.0718, 0.1002, 0.1601, 0.1405, 0.1250, 0.1292],
        [0.2214, 0.0711, 0.0613, 0.1010, 0.1403, 0.1317, 0.1301, 0.1431],
        [0.2675, 0.0650, 0.0586, 0.0860, 0.1253, 0.1442, 0.1349, 0.1185],
        [0.2024, 0.0951, 0.0744, 0.1085, 0.1369, 0.1241, 0.1210, 0.1376],
        [0.2450, 0.0823, 0.0681, 0.0922, 0.1443, 0.1285, 0.1182, 0.1213],
        [0.2201, 0.0663, 0.0566, 0.0857, 0.1492, 0.1365, 0.1472, 0.1384],
        [0.2575, 0.0607, 0.0543, 0.0730, 0.1285, 0.1321, 0.1539, 0.1401],
        [0.2784, 0.0541, 0.0490, 0.0566, 0.1317, 0.1325, 0.1466, 0.1512]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1155, 0.1380, 0.1148, 0.1143, 0.1444, 0.1429, 0.1121, 0.1180],
        [0.1257, 0.1229, 0.1035, 0.1283, 0.1423, 0.1198, 0.1357, 0.1218],
        [0.1128, 0.1368, 0.1141, 0.1339, 0.1336, 0.1164, 0.1282, 0.1241],
        [0.1254, 0.1257, 0.1100, 0.1320, 0.1310, 0.1225, 0.1322, 0.1213],
        [0.1232, 0.1038, 0.0858, 0.1219, 0.1451, 0.1368, 0.1392, 0.1443],
        [0.1175, 0.1477, 0.1260, 0.1198, 0.1192, 0.1217, 0.1309, 0.1171],
        [0.1311, 0.1307, 0.1149, 0.1166, 0.1342, 0.1240, 0.1257, 0.1229],
        [0.1331, 0.1040, 0.0903, 0.1362, 0.1402, 0.1400, 0.1243, 0.1319],
        [0.1440, 0.1004, 0.0927, 0.1352, 0.1290, 0.1369, 0.1310, 0.1306],
        [0.1136, 0.1378, 0.1176, 0.1291, 0.1245, 0.1392, 0.1133, 0.1249],
        [0.1243, 0.1216, 0.1068, 0.1204, 0.1191, 0.1409, 0.1287, 0.1382],
        [0.1303, 0.1020, 0.0862, 0.1283, 0.1401, 0.1552, 0.1132, 0.1447],
        [0.1355, 0.0950, 0.0858, 0.1256, 0.1335, 0.1477, 0.1327, 0.1441],
        [0.1437, 0.0886, 0.0791, 0.1026, 0.1347, 0.1396, 0.1536, 0.1581]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 20 [   0/390]  Loss: 0.5061 (0.506)  Acc@1: 85.9375 (85.9375)  Acc@5: 98.4375 (98.4375)LR: 1.671e-02
Train: 20 [  50/390]  Loss: 0.2743 (0.370)  Acc@1: 90.6250 (87.2549)  Acc@5: 100.0000 (99.5711)LR: 1.671e-02
Train: 20 [ 100/390]  Loss: 0.4377 (0.359)  Acc@1: 89.0625 (87.5464)  Acc@5: 100.0000 (99.7061)LR: 1.671e-02
Train: 20 [ 150/390]  Loss: 0.4190 (0.366)  Acc@1: 87.5000 (87.1999)  Acc@5: 100.0000 (99.6275)LR: 1.671e-02
Train: 20 [ 200/390]  Loss: 0.4013 (0.372)  Acc@1: 84.3750 (87.1502)  Acc@5: 100.0000 (99.5414)LR: 1.671e-02
Train: 20 [ 250/390]  Loss: 0.5101 (0.374)  Acc@1: 84.3750 (87.1016)  Acc@5: 98.4375 (99.5705)LR: 1.671e-02
Train: 20 [ 300/390]  Loss: 0.5499 (0.379)  Acc@1: 79.6875 (86.8563)  Acc@5: 98.4375 (99.5743)LR: 1.671e-02
Train: 20 [ 350/390]  Loss: 0.3420 (0.377)  Acc@1: 89.0625 (86.8990)  Acc@5: 100.0000 (99.5905)LR: 1.671e-02
Train: 20 [ 390/390]  Loss: 0.5310 (0.378)  Acc@1: 80.0000 (86.7600)  Acc@5: 100.0000 (99.5960)LR: 1.671e-02
train_acc 86.760000
Valid: 20 [   0/390]  Loss: 0.7770 (0.777)  Acc@1: 82.8125 (82.8125)  Acc@5: 96.8750 (96.8750)
Valid: 20 [  50/390]  Loss: 0.7716 (0.516)  Acc@1: 76.5625 (83.2721)  Acc@5: 96.8750 (99.0809)
Valid: 20 [ 100/390]  Loss: 0.3454 (0.516)  Acc@1: 87.5000 (83.0600)  Acc@5: 98.4375 (99.0873)
Valid: 20 [ 150/390]  Loss: 0.6408 (0.519)  Acc@1: 79.6875 (83.1126)  Acc@5: 96.8750 (99.1618)
Valid: 20 [ 200/390]  Loss: 0.3651 (0.513)  Acc@1: 85.9375 (83.3489)  Acc@5: 100.0000 (99.1760)
Valid: 20 [ 250/390]  Loss: 0.5381 (0.513)  Acc@1: 82.8125 (83.4288)  Acc@5: 96.8750 (99.1596)
Valid: 20 [ 300/390]  Loss: 0.4396 (0.511)  Acc@1: 87.5000 (83.4614)  Acc@5: 100.0000 (99.1331)
Valid: 20 [ 350/390]  Loss: 0.3079 (0.508)  Acc@1: 87.5000 (83.4669)  Acc@5: 100.0000 (99.1765)
Valid: 20 [ 390/390]  Loss: 0.5091 (0.509)  Acc@1: 87.5000 (83.4480)  Acc@5: 100.0000 (99.1680)
valid_acc 83.448000
epoch = 20   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_3x3', 3), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1483, 0.0914, 0.0665, 0.1042, 0.1742, 0.1420, 0.1410, 0.1323],
        [0.1466, 0.0812, 0.0611, 0.0881, 0.1898, 0.1623, 0.1329, 0.1380],
        [0.1655, 0.0971, 0.0709, 0.1095, 0.1729, 0.1327, 0.1143, 0.1371],
        [0.1627, 0.0890, 0.0704, 0.1033, 0.1439, 0.1573, 0.1377, 0.1357],
        [0.1815, 0.0735, 0.0593, 0.0977, 0.1487, 0.1554, 0.1432, 0.1408],
        [0.1936, 0.0957, 0.0720, 0.1124, 0.1352, 0.1301, 0.1325, 0.1285],
        [0.1921, 0.0848, 0.0698, 0.0996, 0.1613, 0.1399, 0.1239, 0.1286],
        [0.2308, 0.0686, 0.0591, 0.1001, 0.1393, 0.1309, 0.1296, 0.1415],
        [0.2801, 0.0624, 0.0562, 0.0841, 0.1230, 0.1440, 0.1345, 0.1156],
        [0.2100, 0.0933, 0.0726, 0.1082, 0.1344, 0.1233, 0.1211, 0.1371],
        [0.2566, 0.0801, 0.0659, 0.0911, 0.1453, 0.1272, 0.1158, 0.1180],
        [0.2292, 0.0640, 0.0544, 0.0844, 0.1480, 0.1356, 0.1473, 0.1372],
        [0.2688, 0.0583, 0.0520, 0.0713, 0.1280, 0.1295, 0.1532, 0.1389],
        [0.2878, 0.0516, 0.0467, 0.0543, 0.1305, 0.1321, 0.1458, 0.1513]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1152, 0.1382, 0.1135, 0.1158, 0.1439, 0.1438, 0.1121, 0.1176],
        [0.1253, 0.1228, 0.1030, 0.1286, 0.1420, 0.1202, 0.1353, 0.1227],
        [0.1123, 0.1367, 0.1131, 0.1334, 0.1344, 0.1169, 0.1293, 0.1239],
        [0.1256, 0.1264, 0.1103, 0.1321, 0.1311, 0.1214, 0.1320, 0.1211],
        [0.1227, 0.1027, 0.0852, 0.1227, 0.1459, 0.1372, 0.1392, 0.1443],
        [0.1170, 0.1481, 0.1252, 0.1207, 0.1205, 0.1214, 0.1306, 0.1165],
        [0.1306, 0.1318, 0.1163, 0.1164, 0.1341, 0.1231, 0.1245, 0.1233],
        [0.1333, 0.1030, 0.0902, 0.1379, 0.1410, 0.1400, 0.1232, 0.1314],
        [0.1443, 0.0988, 0.0921, 0.1366, 0.1294, 0.1377, 0.1302, 0.1309],
        [0.1138, 0.1378, 0.1169, 0.1292, 0.1258, 0.1398, 0.1127, 0.1241],
        [0.1242, 0.1216, 0.1069, 0.1205, 0.1193, 0.1398, 0.1290, 0.1388],
        [0.1311, 0.1007, 0.0856, 0.1298, 0.1399, 0.1559, 0.1123, 0.1448],
        [0.1358, 0.0934, 0.0847, 0.1263, 0.1349, 0.1487, 0.1335, 0.1427],
        [0.1441, 0.0868, 0.0780, 0.1023, 0.1352, 0.1397, 0.1550, 0.1588]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 21 [   0/390]  Loss: 0.3951 (0.395)  Acc@1: 90.6250 (90.6250)  Acc@5: 100.0000 (100.0000)LR: 1.598e-02
Train: 21 [  50/390]  Loss: 0.3831 (0.347)  Acc@1: 89.0625 (88.3578)  Acc@5: 96.8750 (99.6936)LR: 1.598e-02
Train: 21 [ 100/390]  Loss: 0.3793 (0.337)  Acc@1: 87.5000 (88.5056)  Acc@5: 100.0000 (99.6906)LR: 1.598e-02
Train: 21 [ 150/390]  Loss: 0.1899 (0.345)  Acc@1: 93.7500 (88.2347)  Acc@5: 100.0000 (99.6068)LR: 1.598e-02
Train: 21 [ 200/390]  Loss: 0.5066 (0.349)  Acc@1: 84.3750 (88.0208)  Acc@5: 96.8750 (99.5880)LR: 1.598e-02
Train: 21 [ 250/390]  Loss: 0.3229 (0.355)  Acc@1: 87.5000 (87.8237)  Acc@5: 100.0000 (99.5580)LR: 1.598e-02
Train: 21 [ 300/390]  Loss: 0.4984 (0.362)  Acc@1: 78.1250 (87.5623)  Acc@5: 100.0000 (99.5640)LR: 1.598e-02
Train: 21 [ 350/390]  Loss: 0.2843 (0.367)  Acc@1: 92.1875 (87.3442)  Acc@5: 98.4375 (99.5370)LR: 1.598e-02
Train: 21 [ 390/390]  Loss: 0.3698 (0.368)  Acc@1: 85.0000 (87.2280)  Acc@5: 100.0000 (99.5320)LR: 1.598e-02
train_acc 87.228000
Valid: 21 [   0/390]  Loss: 0.5073 (0.507)  Acc@1: 82.8125 (82.8125)  Acc@5: 100.0000 (100.0000)
Valid: 21 [  50/390]  Loss: 0.3301 (0.482)  Acc@1: 92.1875 (84.4363)  Acc@5: 96.8750 (99.0196)
Valid: 21 [ 100/390]  Loss: 0.4183 (0.458)  Acc@1: 87.5000 (85.1330)  Acc@5: 100.0000 (99.1491)
Valid: 21 [ 150/390]  Loss: 0.4201 (0.466)  Acc@1: 85.9375 (84.8406)  Acc@5: 100.0000 (99.2239)
Valid: 21 [ 200/390]  Loss: 0.7087 (0.465)  Acc@1: 82.8125 (84.7870)  Acc@5: 98.4375 (99.2149)
Valid: 21 [ 250/390]  Loss: 0.5086 (0.463)  Acc@1: 85.9375 (84.9041)  Acc@5: 100.0000 (99.2530)
Valid: 21 [ 300/390]  Loss: 0.4978 (0.462)  Acc@1: 89.0625 (84.8993)  Acc@5: 100.0000 (99.2681)
Valid: 21 [ 350/390]  Loss: 0.4686 (0.461)  Acc@1: 87.5000 (84.8113)  Acc@5: 95.3125 (99.2967)
Valid: 21 [ 390/390]  Loss: 0.5379 (0.464)  Acc@1: 77.5000 (84.7960)  Acc@5: 100.0000 (99.2760)
valid_acc 84.796000
epoch = 21   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 2), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_3x3', 3), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1496, 0.0891, 0.0652, 0.1040, 0.1775, 0.1404, 0.1419, 0.1322],
        [0.1484, 0.0787, 0.0594, 0.0866, 0.1927, 0.1635, 0.1319, 0.1388],
        [0.1675, 0.0959, 0.0702, 0.1099, 0.1739, 0.1337, 0.1131, 0.1358],
        [0.1653, 0.0870, 0.0690, 0.1027, 0.1471, 0.1578, 0.1365, 0.1345],
        [0.1840, 0.0715, 0.0578, 0.0965, 0.1484, 0.1591, 0.1432, 0.1395],
        [0.1975, 0.0948, 0.0717, 0.1136, 0.1343, 0.1273, 0.1315, 0.1293],
        [0.1971, 0.0830, 0.0686, 0.0994, 0.1628, 0.1392, 0.1236, 0.1264],
        [0.2393, 0.0667, 0.0577, 0.0997, 0.1381, 0.1292, 0.1282, 0.1412],
        [0.2926, 0.0603, 0.0546, 0.0829, 0.1213, 0.1423, 0.1315, 0.1145],
        [0.2156, 0.0914, 0.0716, 0.1080, 0.1340, 0.1223, 0.1214, 0.1357],
        [0.2655, 0.0779, 0.0644, 0.0900, 0.1464, 0.1253, 0.1147, 0.1159],
        [0.2378, 0.0623, 0.0531, 0.0836, 0.1470, 0.1338, 0.1460, 0.1364],
        [0.2774, 0.0564, 0.0508, 0.0703, 0.1268, 0.1282, 0.1521, 0.1380],
        [0.2978, 0.0496, 0.0450, 0.0526, 0.1291, 0.1307, 0.1441, 0.1511]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1138, 0.1386, 0.1132, 0.1165, 0.1434, 0.1451, 0.1117, 0.1177],
        [0.1259, 0.1211, 0.1017, 0.1285, 0.1436, 0.1207, 0.1355, 0.1229],
        [0.1114, 0.1378, 0.1134, 0.1336, 0.1359, 0.1157, 0.1290, 0.1232],
        [0.1261, 0.1253, 0.1093, 0.1319, 0.1318, 0.1212, 0.1325, 0.1218],
        [0.1229, 0.1021, 0.0851, 0.1237, 0.1454, 0.1368, 0.1386, 0.1453],
        [0.1158, 0.1488, 0.1250, 0.1213, 0.1215, 0.1213, 0.1301, 0.1163],
        [0.1313, 0.1312, 0.1164, 0.1165, 0.1345, 0.1230, 0.1246, 0.1224],
        [0.1324, 0.1025, 0.0906, 0.1386, 0.1431, 0.1389, 0.1228, 0.1312],
        [0.1446, 0.0983, 0.0921, 0.1378, 0.1283, 0.1373, 0.1304, 0.1311],
        [0.1129, 0.1381, 0.1170, 0.1293, 0.1265, 0.1407, 0.1124, 0.1232],
        [0.1248, 0.1207, 0.1065, 0.1201, 0.1203, 0.1400, 0.1294, 0.1383],
        [0.1312, 0.0993, 0.0849, 0.1301, 0.1397, 0.1567, 0.1130, 0.1452],
        [0.1367, 0.0922, 0.0841, 0.1268, 0.1356, 0.1480, 0.1337, 0.1428],
        [0.1460, 0.0855, 0.0774, 0.1024, 0.1368, 0.1395, 0.1520, 0.1603]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 22 [   0/390]  Loss: 0.5549 (0.555)  Acc@1: 79.6875 (79.6875)  Acc@5: 98.4375 (98.4375)LR: 1.525e-02
Train: 22 [  50/390]  Loss: 0.3349 (0.335)  Acc@1: 89.0625 (88.2966)  Acc@5: 98.4375 (99.7549)LR: 1.525e-02
Train: 22 [ 100/390]  Loss: 0.3713 (0.342)  Acc@1: 89.0625 (88.0415)  Acc@5: 98.4375 (99.6597)LR: 1.525e-02
Train: 22 [ 150/390]  Loss: 0.2334 (0.339)  Acc@1: 93.7500 (88.2036)  Acc@5: 100.0000 (99.7103)LR: 1.525e-02
Train: 22 [ 200/390]  Loss: 0.3350 (0.342)  Acc@1: 89.0625 (88.0675)  Acc@5: 100.0000 (99.6813)LR: 1.525e-02
Train: 22 [ 250/390]  Loss: 0.3854 (0.342)  Acc@1: 92.1875 (88.0727)  Acc@5: 96.8750 (99.6763)LR: 1.525e-02
Train: 22 [ 300/390]  Loss: 0.5404 (0.345)  Acc@1: 81.2500 (87.9360)  Acc@5: 100.0000 (99.6730)LR: 1.525e-02
Train: 22 [ 350/390]  Loss: 0.3141 (0.345)  Acc@1: 87.5000 (87.9407)  Acc@5: 100.0000 (99.6617)LR: 1.525e-02
Train: 22 [ 390/390]  Loss: 0.3441 (0.346)  Acc@1: 80.0000 (87.9000)  Acc@5: 100.0000 (99.6720)LR: 1.525e-02
train_acc 87.900000
Valid: 22 [   0/390]  Loss: 0.5986 (0.599)  Acc@1: 85.9375 (85.9375)  Acc@5: 98.4375 (98.4375)
Valid: 22 [  50/390]  Loss: 0.4082 (0.456)  Acc@1: 81.2500 (84.1605)  Acc@5: 98.4375 (99.3260)
Valid: 22 [ 100/390]  Loss: 0.7137 (0.460)  Acc@1: 76.5625 (84.5297)  Acc@5: 96.8750 (99.1646)
Valid: 22 [ 150/390]  Loss: 0.3033 (0.456)  Acc@1: 89.0625 (84.7579)  Acc@5: 100.0000 (99.2860)
Valid: 22 [ 200/390]  Loss: 0.4265 (0.457)  Acc@1: 87.5000 (84.9036)  Acc@5: 96.8750 (99.2304)
Valid: 22 [ 250/390]  Loss: 0.4076 (0.464)  Acc@1: 78.1250 (84.7112)  Acc@5: 100.0000 (99.2281)
Valid: 22 [ 300/390]  Loss: 0.3296 (0.464)  Acc@1: 87.5000 (84.6190)  Acc@5: 100.0000 (99.2317)
Valid: 22 [ 350/390]  Loss: 0.5717 (0.465)  Acc@1: 81.2500 (84.6332)  Acc@5: 98.4375 (99.2521)
Valid: 22 [ 390/390]  Loss: 0.6057 (0.464)  Acc@1: 80.0000 (84.7560)  Acc@5: 97.5000 (99.2240)
valid_acc 84.756000
epoch = 22   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 2), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('dil_conv_3x3', 3), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1502, 0.0865, 0.0637, 0.1031, 0.1810, 0.1405, 0.1420, 0.1330],
        [0.1501, 0.0760, 0.0573, 0.0844, 0.1967, 0.1651, 0.1309, 0.1396],
        [0.1698, 0.0943, 0.0693, 0.1106, 0.1763, 0.1320, 0.1125, 0.1352],
        [0.1697, 0.0849, 0.0674, 0.1022, 0.1488, 0.1572, 0.1354, 0.1344],
        [0.1866, 0.0699, 0.0568, 0.0963, 0.1481, 0.1599, 0.1433, 0.1392],
        [0.2016, 0.0933, 0.0709, 0.1146, 0.1337, 0.1264, 0.1311, 0.1283],
        [0.2043, 0.0809, 0.0670, 0.0992, 0.1639, 0.1375, 0.1219, 0.1252],
        [0.2461, 0.0649, 0.0564, 0.0995, 0.1362, 0.1289, 0.1274, 0.1406],
        [0.3064, 0.0576, 0.0527, 0.0810, 0.1190, 0.1405, 0.1296, 0.1131],
        [0.2234, 0.0890, 0.0704, 0.1077, 0.1305, 0.1226, 0.1207, 0.1358],
        [0.2767, 0.0753, 0.0623, 0.0884, 0.1461, 0.1237, 0.1130, 0.1145],
        [0.2454, 0.0604, 0.0515, 0.0827, 0.1463, 0.1319, 0.1461, 0.1358],
        [0.2902, 0.0538, 0.0487, 0.0684, 0.1249, 0.1246, 0.1507, 0.1387],
        [0.3113, 0.0473, 0.0431, 0.0507, 0.1267, 0.1292, 0.1415, 0.1503]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1133, 0.1386, 0.1122, 0.1169, 0.1425, 0.1458, 0.1127, 0.1180],
        [0.1254, 0.1197, 0.1012, 0.1290, 0.1439, 0.1214, 0.1359, 0.1234],
        [0.1110, 0.1389, 0.1130, 0.1333, 0.1355, 0.1154, 0.1297, 0.1234],
        [0.1260, 0.1240, 0.1086, 0.1315, 0.1327, 0.1211, 0.1334, 0.1227],
        [0.1232, 0.1008, 0.0851, 0.1252, 0.1444, 0.1369, 0.1384, 0.1460],
        [0.1158, 0.1487, 0.1243, 0.1214, 0.1223, 0.1212, 0.1318, 0.1145],
        [0.1311, 0.1306, 0.1169, 0.1176, 0.1343, 0.1228, 0.1251, 0.1215],
        [0.1315, 0.1009, 0.0903, 0.1395, 0.1452, 0.1382, 0.1233, 0.1312],
        [0.1460, 0.0965, 0.0912, 0.1378, 0.1276, 0.1376, 0.1309, 0.1324],
        [0.1125, 0.1383, 0.1162, 0.1292, 0.1276, 0.1423, 0.1114, 0.1225],
        [0.1242, 0.1206, 0.1073, 0.1204, 0.1197, 0.1405, 0.1297, 0.1377],
        [0.1317, 0.0978, 0.0848, 0.1315, 0.1408, 0.1566, 0.1131, 0.1436],
        [0.1371, 0.0905, 0.0834, 0.1268, 0.1372, 0.1478, 0.1343, 0.1430],
        [0.1475, 0.0837, 0.0766, 0.1022, 0.1375, 0.1390, 0.1525, 0.1609]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 23 [   0/390]  Loss: 0.4715 (0.472)  Acc@1: 82.8125 (82.8125)  Acc@5: 98.4375 (98.4375)LR: 1.450e-02
Train: 23 [  50/390]  Loss: 0.2587 (0.332)  Acc@1: 89.0625 (88.1434)  Acc@5: 100.0000 (99.7243)LR: 1.450e-02
Train: 23 [ 100/390]  Loss: 0.3037 (0.331)  Acc@1: 90.6250 (88.4282)  Acc@5: 100.0000 (99.7525)LR: 1.450e-02
Train: 23 [ 150/390]  Loss: 0.4436 (0.341)  Acc@1: 87.5000 (87.9863)  Acc@5: 100.0000 (99.6999)LR: 1.450e-02
Train: 23 [ 200/390]  Loss: 0.3615 (0.337)  Acc@1: 85.9375 (88.0364)  Acc@5: 100.0000 (99.6968)LR: 1.450e-02
Train: 23 [ 250/390]  Loss: 0.3470 (0.337)  Acc@1: 89.0625 (88.1225)  Acc@5: 98.4375 (99.6950)LR: 1.450e-02
Train: 23 [ 300/390]  Loss: 0.3398 (0.335)  Acc@1: 87.5000 (88.2475)  Acc@5: 100.0000 (99.6937)LR: 1.450e-02
Train: 23 [ 350/390]  Loss: 0.3659 (0.338)  Acc@1: 90.6250 (88.1722)  Acc@5: 98.4375 (99.6839)LR: 1.450e-02
Train: 23 [ 390/390]  Loss: 0.5285 (0.340)  Acc@1: 82.5000 (88.1440)  Acc@5: 97.5000 (99.6680)LR: 1.450e-02
train_acc 88.144000
Valid: 23 [   0/390]  Loss: 0.3460 (0.346)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)
Valid: 23 [  50/390]  Loss: 0.6720 (0.468)  Acc@1: 79.6875 (84.5588)  Acc@5: 98.4375 (99.1422)
Valid: 23 [ 100/390]  Loss: 0.3903 (0.462)  Acc@1: 87.5000 (84.8082)  Acc@5: 100.0000 (99.3193)
Valid: 23 [ 150/390]  Loss: 0.3318 (0.465)  Acc@1: 87.5000 (84.7889)  Acc@5: 100.0000 (99.3067)
Valid: 23 [ 200/390]  Loss: 0.4291 (0.466)  Acc@1: 85.9375 (84.7015)  Acc@5: 100.0000 (99.3004)
Valid: 23 [ 250/390]  Loss: 0.4216 (0.465)  Acc@1: 87.5000 (84.7174)  Acc@5: 100.0000 (99.3215)
Valid: 23 [ 300/390]  Loss: 0.8518 (0.470)  Acc@1: 79.6875 (84.5878)  Acc@5: 98.4375 (99.2629)
Valid: 23 [ 350/390]  Loss: 0.4936 (0.468)  Acc@1: 82.8125 (84.6465)  Acc@5: 98.4375 (99.3011)
Valid: 23 [ 390/390]  Loss: 0.1736 (0.467)  Acc@1: 95.0000 (84.6880)  Acc@5: 100.0000 (99.3160)
valid_acc 84.688000
epoch = 23   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 2), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('dil_conv_5x5', 4), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1518, 0.0842, 0.0624, 0.1027, 0.1814, 0.1403, 0.1424, 0.1347],
        [0.1514, 0.0735, 0.0556, 0.0825, 0.1993, 0.1662, 0.1306, 0.1407],
        [0.1729, 0.0922, 0.0682, 0.1107, 0.1792, 0.1320, 0.1113, 0.1335],
        [0.1740, 0.0824, 0.0661, 0.1016, 0.1506, 0.1563, 0.1352, 0.1339],
        [0.1892, 0.0678, 0.0552, 0.0952, 0.1482, 0.1603, 0.1441, 0.1400],
        [0.2069, 0.0911, 0.0698, 0.1151, 0.1329, 0.1256, 0.1293, 0.1291],
        [0.2122, 0.0786, 0.0655, 0.0983, 0.1658, 0.1352, 0.1203, 0.1241],
        [0.2552, 0.0626, 0.0549, 0.0984, 0.1354, 0.1277, 0.1257, 0.1401],
        [0.3201, 0.0552, 0.0507, 0.0786, 0.1159, 0.1393, 0.1275, 0.1126],
        [0.2313, 0.0865, 0.0691, 0.1074, 0.1282, 0.1219, 0.1203, 0.1353],
        [0.2885, 0.0721, 0.0601, 0.0863, 0.1446, 0.1235, 0.1120, 0.1128],
        [0.2547, 0.0582, 0.0500, 0.0820, 0.1453, 0.1292, 0.1460, 0.1345],
        [0.3025, 0.0517, 0.0470, 0.0668, 0.1244, 0.1217, 0.1487, 0.1371],
        [0.3241, 0.0452, 0.0414, 0.0490, 0.1242, 0.1279, 0.1392, 0.1490]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1129, 0.1379, 0.1113, 0.1165, 0.1414, 0.1477, 0.1133, 0.1190],
        [0.1252, 0.1180, 0.0999, 0.1300, 0.1448, 0.1224, 0.1359, 0.1237],
        [0.1105, 0.1382, 0.1123, 0.1354, 0.1357, 0.1152, 0.1302, 0.1225],
        [0.1264, 0.1225, 0.1077, 0.1306, 0.1335, 0.1208, 0.1333, 0.1253],
        [0.1224, 0.0990, 0.0844, 0.1246, 0.1457, 0.1371, 0.1394, 0.1475],
        [0.1159, 0.1472, 0.1233, 0.1223, 0.1219, 0.1214, 0.1339, 0.1142],
        [0.1325, 0.1306, 0.1171, 0.1169, 0.1340, 0.1231, 0.1246, 0.1213],
        [0.1313, 0.0995, 0.0903, 0.1403, 0.1463, 0.1383, 0.1229, 0.1311],
        [0.1441, 0.0960, 0.0909, 0.1384, 0.1276, 0.1395, 0.1297, 0.1338],
        [0.1124, 0.1370, 0.1154, 0.1298, 0.1291, 0.1433, 0.1105, 0.1224],
        [0.1245, 0.1197, 0.1071, 0.1215, 0.1200, 0.1401, 0.1301, 0.1371],
        [0.1326, 0.0967, 0.0848, 0.1329, 0.1413, 0.1552, 0.1131, 0.1434],
        [0.1363, 0.0897, 0.0831, 0.1267, 0.1376, 0.1481, 0.1348, 0.1437],
        [0.1487, 0.0827, 0.0762, 0.1022, 0.1389, 0.1396, 0.1509, 0.1607]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 24 [   0/390]  Loss: 0.3320 (0.332)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)LR: 1.375e-02
Train: 24 [  50/390]  Loss: 0.3102 (0.324)  Acc@1: 92.1875 (89.0931)  Acc@5: 100.0000 (99.6017)LR: 1.375e-02
Train: 24 [ 100/390]  Loss: 0.4146 (0.317)  Acc@1: 90.6250 (89.3564)  Acc@5: 100.0000 (99.6287)LR: 1.375e-02
Train: 24 [ 150/390]  Loss: 0.2985 (0.332)  Acc@1: 90.6250 (88.9073)  Acc@5: 98.4375 (99.6171)LR: 1.375e-02
Train: 24 [ 200/390]  Loss: 0.2727 (0.331)  Acc@1: 92.1875 (88.7982)  Acc@5: 100.0000 (99.6891)LR: 1.375e-02
Train: 24 [ 250/390]  Loss: 0.2269 (0.332)  Acc@1: 90.6250 (88.6081)  Acc@5: 100.0000 (99.7199)LR: 1.375e-02
Train: 24 [ 300/390]  Loss: 0.3690 (0.333)  Acc@1: 84.3750 (88.5174)  Acc@5: 98.4375 (99.6989)LR: 1.375e-02
Train: 24 [ 350/390]  Loss: 0.3240 (0.334)  Acc@1: 85.9375 (88.4081)  Acc@5: 100.0000 (99.6750)LR: 1.375e-02
Train: 24 [ 390/390]  Loss: 0.5707 (0.337)  Acc@1: 77.5000 (88.2080)  Acc@5: 100.0000 (99.6680)LR: 1.375e-02
train_acc 88.208000
Valid: 24 [   0/390]  Loss: 0.6276 (0.628)  Acc@1: 84.3750 (84.3750)  Acc@5: 100.0000 (100.0000)
Valid: 24 [  50/390]  Loss: 0.2396 (0.432)  Acc@1: 92.1875 (85.2328)  Acc@5: 100.0000 (99.5404)
Valid: 24 [ 100/390]  Loss: 0.5292 (0.426)  Acc@1: 82.8125 (85.1485)  Acc@5: 100.0000 (99.5978)
Valid: 24 [ 150/390]  Loss: 0.4133 (0.434)  Acc@1: 84.3750 (84.9441)  Acc@5: 98.4375 (99.5654)
Valid: 24 [ 200/390]  Loss: 0.3770 (0.426)  Acc@1: 85.9375 (85.2767)  Acc@5: 100.0000 (99.5647)
Valid: 24 [ 250/390]  Loss: 0.4684 (0.429)  Acc@1: 87.5000 (85.2776)  Acc@5: 100.0000 (99.5269)
Valid: 24 [ 300/390]  Loss: 0.2785 (0.434)  Acc@1: 92.1875 (85.1900)  Acc@5: 100.0000 (99.4601)
Valid: 24 [ 350/390]  Loss: 0.4687 (0.436)  Acc@1: 82.8125 (85.1807)  Acc@5: 100.0000 (99.4436)
Valid: 24 [ 390/390]  Loss: 0.7161 (0.439)  Acc@1: 72.5000 (85.1400)  Acc@5: 97.5000 (99.4080)
valid_acc 85.140000
epoch = 24   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 2), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('dil_conv_5x5', 4), ('dil_conv_3x3', 3)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1532, 0.0828, 0.0610, 0.1027, 0.1819, 0.1391, 0.1437, 0.1356],
        [0.1531, 0.0717, 0.0543, 0.0817, 0.2001, 0.1684, 0.1298, 0.1409],
        [0.1759, 0.0911, 0.0669, 0.1110, 0.1804, 0.1317, 0.1108, 0.1324],
        [0.1759, 0.0812, 0.0649, 0.1020, 0.1520, 0.1558, 0.1341, 0.1341],
        [0.1942, 0.0654, 0.0534, 0.0940, 0.1473, 0.1615, 0.1449, 0.1392],
        [0.2125, 0.0900, 0.0686, 0.1160, 0.1305, 0.1250, 0.1293, 0.1281],
        [0.2176, 0.0773, 0.0645, 0.0987, 0.1663, 0.1342, 0.1184, 0.1230],
        [0.2638, 0.0607, 0.0532, 0.0975, 0.1348, 0.1274, 0.1244, 0.1382],
        [0.3335, 0.0531, 0.0489, 0.0768, 0.1124, 0.1382, 0.1250, 0.1121],
        [0.2390, 0.0847, 0.0677, 0.1072, 0.1274, 0.1207, 0.1202, 0.1330],
        [0.2996, 0.0700, 0.0585, 0.0855, 0.1435, 0.1205, 0.1107, 0.1117],
        [0.2653, 0.0558, 0.0483, 0.0807, 0.1431, 0.1265, 0.1465, 0.1338],
        [0.3158, 0.0494, 0.0451, 0.0648, 0.1238, 0.1186, 0.1473, 0.1352],
        [0.3374, 0.0430, 0.0396, 0.0473, 0.1219, 0.1260, 0.1363, 0.1484]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1131, 0.1383, 0.1109, 0.1151, 0.1419, 0.1483, 0.1135, 0.1188],
        [0.1243, 0.1178, 0.0994, 0.1312, 0.1455, 0.1214, 0.1371, 0.1234],
        [0.1106, 0.1378, 0.1113, 0.1358, 0.1368, 0.1145, 0.1305, 0.1227],
        [0.1258, 0.1231, 0.1081, 0.1300, 0.1342, 0.1197, 0.1335, 0.1255],
        [0.1219, 0.0977, 0.0842, 0.1251, 0.1468, 0.1373, 0.1387, 0.1484],
        [0.1151, 0.1475, 0.1231, 0.1236, 0.1218, 0.1218, 0.1329, 0.1142],
        [0.1321, 0.1303, 0.1167, 0.1170, 0.1348, 0.1235, 0.1248, 0.1206],
        [0.1318, 0.0978, 0.0902, 0.1423, 0.1471, 0.1379, 0.1222, 0.1306],
        [0.1440, 0.0945, 0.0903, 0.1395, 0.1286, 0.1400, 0.1291, 0.1340],
        [0.1123, 0.1367, 0.1145, 0.1311, 0.1305, 0.1439, 0.1090, 0.1220],
        [0.1244, 0.1202, 0.1075, 0.1225, 0.1194, 0.1383, 0.1306, 0.1370],
        [0.1328, 0.0961, 0.0848, 0.1343, 0.1430, 0.1534, 0.1130, 0.1426],
        [0.1369, 0.0888, 0.0829, 0.1278, 0.1388, 0.1471, 0.1333, 0.1444],
        [0.1497, 0.0816, 0.0756, 0.1024, 0.1417, 0.1383, 0.1500, 0.1607]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 25 [   0/390]  Loss: 0.4159 (0.416)  Acc@1: 79.6875 (79.6875)  Acc@5: 100.0000 (100.0000)LR: 1.300e-02
Train: 25 [  50/390]  Loss: 0.5772 (0.345)  Acc@1: 78.1250 (87.2243)  Acc@5: 100.0000 (99.7243)LR: 1.300e-02
Train: 25 [ 100/390]  Loss: 0.3537 (0.336)  Acc@1: 84.3750 (87.9950)  Acc@5: 100.0000 (99.7370)LR: 1.300e-02
Train: 25 [ 150/390]  Loss: 0.3374 (0.328)  Acc@1: 89.0625 (88.3485)  Acc@5: 100.0000 (99.7206)LR: 1.300e-02
Train: 25 [ 200/390]  Loss: 0.3673 (0.324)  Acc@1: 85.9375 (88.4873)  Acc@5: 98.4375 (99.7201)LR: 1.300e-02
Train: 25 [ 250/390]  Loss: 0.2746 (0.324)  Acc@1: 89.0625 (88.5022)  Acc@5: 100.0000 (99.6950)LR: 1.300e-02
Train: 25 [ 300/390]  Loss: 0.1634 (0.322)  Acc@1: 93.7500 (88.5745)  Acc@5: 100.0000 (99.6678)LR: 1.300e-02
Train: 25 [ 350/390]  Loss: 0.3934 (0.324)  Acc@1: 84.3750 (88.4927)  Acc@5: 100.0000 (99.6839)LR: 1.300e-02
Train: 25 [ 390/390]  Loss: 0.4612 (0.324)  Acc@1: 82.5000 (88.5360)  Acc@5: 100.0000 (99.6880)LR: 1.300e-02
train_acc 88.536000
Valid: 25 [   0/390]  Loss: 0.6442 (0.644)  Acc@1: 81.2500 (81.2500)  Acc@5: 96.8750 (96.8750)
Valid: 25 [  50/390]  Loss: 0.4930 (0.407)  Acc@1: 81.2500 (86.0907)  Acc@5: 100.0000 (99.3873)
Valid: 25 [ 100/390]  Loss: 0.5719 (0.415)  Acc@1: 81.2500 (85.9375)  Acc@5: 98.4375 (99.4895)
Valid: 25 [ 150/390]  Loss: 0.1458 (0.422)  Acc@1: 93.7500 (85.8754)  Acc@5: 100.0000 (99.4516)
Valid: 25 [ 200/390]  Loss: 0.5475 (0.430)  Acc@1: 82.8125 (85.5955)  Acc@5: 98.4375 (99.3781)
Valid: 25 [ 250/390]  Loss: 0.5790 (0.430)  Acc@1: 85.9375 (85.5889)  Acc@5: 98.4375 (99.3775)
Valid: 25 [ 300/390]  Loss: 0.3240 (0.429)  Acc@1: 84.3750 (85.6468)  Acc@5: 98.4375 (99.3511)
Valid: 25 [ 350/390]  Loss: 0.3866 (0.429)  Acc@1: 82.8125 (85.6303)  Acc@5: 100.0000 (99.3634)
Valid: 25 [ 390/390]  Loss: 0.3725 (0.431)  Acc@1: 85.0000 (85.5320)  Acc@5: 100.0000 (99.3600)
valid_acc 85.532000
epoch = 25   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 2), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1538, 0.0813, 0.0603, 0.1033, 0.1837, 0.1390, 0.1437, 0.1349],
        [0.1545, 0.0701, 0.0534, 0.0811, 0.2007, 0.1715, 0.1294, 0.1393],
        [0.1790, 0.0896, 0.0662, 0.1114, 0.1813, 0.1318, 0.1097, 0.1309],
        [0.1782, 0.0794, 0.0638, 0.1013, 0.1537, 0.1557, 0.1341, 0.1337],
        [0.1978, 0.0635, 0.0524, 0.0936, 0.1473, 0.1610, 0.1454, 0.1389],
        [0.2167, 0.0887, 0.0680, 0.1167, 0.1298, 0.1226, 0.1280, 0.1294],
        [0.2235, 0.0759, 0.0637, 0.0989, 0.1659, 0.1321, 0.1173, 0.1227],
        [0.2740, 0.0591, 0.0522, 0.0974, 0.1334, 0.1241, 0.1234, 0.1363],
        [0.3462, 0.0511, 0.0474, 0.0752, 0.1096, 0.1367, 0.1231, 0.1107],
        [0.2462, 0.0829, 0.0670, 0.1073, 0.1252, 0.1198, 0.1192, 0.1325],
        [0.3119, 0.0680, 0.0574, 0.0847, 0.1422, 0.1172, 0.1089, 0.1098],
        [0.2755, 0.0540, 0.0472, 0.0803, 0.1405, 0.1252, 0.1457, 0.1317],
        [0.3301, 0.0474, 0.0437, 0.0635, 0.1206, 0.1156, 0.1449, 0.1342],
        [0.3499, 0.0410, 0.0383, 0.0461, 0.1197, 0.1231, 0.1334, 0.1485]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1138, 0.1385, 0.1110, 0.1157, 0.1401, 0.1481, 0.1145, 0.1183],
        [0.1230, 0.1176, 0.0990, 0.1325, 0.1455, 0.1221, 0.1375, 0.1228],
        [0.1100, 0.1380, 0.1115, 0.1366, 0.1365, 0.1146, 0.1305, 0.1224],
        [0.1252, 0.1234, 0.1083, 0.1298, 0.1353, 0.1197, 0.1338, 0.1245],
        [0.1224, 0.0964, 0.0842, 0.1256, 0.1472, 0.1378, 0.1386, 0.1477],
        [0.1139, 0.1481, 0.1239, 0.1232, 0.1219, 0.1221, 0.1334, 0.1135],
        [0.1323, 0.1313, 0.1179, 0.1165, 0.1337, 0.1229, 0.1244, 0.1209],
        [0.1322, 0.0957, 0.0902, 0.1430, 0.1479, 0.1370, 0.1227, 0.1313],
        [0.1441, 0.0930, 0.0905, 0.1406, 0.1283, 0.1401, 0.1295, 0.1338],
        [0.1128, 0.1365, 0.1146, 0.1308, 0.1304, 0.1445, 0.1082, 0.1222],
        [0.1238, 0.1200, 0.1076, 0.1241, 0.1197, 0.1376, 0.1303, 0.1370],
        [0.1321, 0.0943, 0.0843, 0.1340, 0.1434, 0.1541, 0.1134, 0.1443],
        [0.1375, 0.0872, 0.0826, 0.1277, 0.1406, 0.1461, 0.1336, 0.1446],
        [0.1522, 0.0805, 0.0754, 0.1024, 0.1431, 0.1371, 0.1498, 0.1596]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 26 [   0/390]  Loss: 0.3260 (0.326)  Acc@1: 92.1875 (92.1875)  Acc@5: 98.4375 (98.4375)LR: 1.225e-02
Train: 26 [  50/390]  Loss: 0.3212 (0.292)  Acc@1: 87.5000 (90.0429)  Acc@5: 100.0000 (99.8162)LR: 1.225e-02
Train: 26 [ 100/390]  Loss: 0.2400 (0.315)  Acc@1: 93.7500 (88.9233)  Acc@5: 100.0000 (99.7370)LR: 1.225e-02
Train: 26 [ 150/390]  Loss: 0.2883 (0.317)  Acc@1: 92.1875 (88.8969)  Acc@5: 98.4375 (99.7413)LR: 1.225e-02
Train: 26 [ 200/390]  Loss: 0.2097 (0.312)  Acc@1: 92.1875 (89.0703)  Acc@5: 100.0000 (99.7823)LR: 1.225e-02
Train: 26 [ 250/390]  Loss: 0.2624 (0.313)  Acc@1: 90.6250 (89.0812)  Acc@5: 98.4375 (99.7448)LR: 1.225e-02
Train: 26 [ 300/390]  Loss: 0.2421 (0.312)  Acc@1: 93.7500 (89.1819)  Acc@5: 100.0000 (99.7404)LR: 1.225e-02
Train: 26 [ 350/390]  Loss: 0.1958 (0.316)  Acc@1: 93.7500 (89.0714)  Acc@5: 100.0000 (99.7463)LR: 1.225e-02
Train: 26 [ 390/390]  Loss: 0.3947 (0.316)  Acc@1: 90.0000 (89.0600)  Acc@5: 100.0000 (99.7520)LR: 1.225e-02
train_acc 89.060000
Valid: 26 [   0/390]  Loss: 0.4533 (0.453)  Acc@1: 84.3750 (84.3750)  Acc@5: 98.4375 (98.4375)
Valid: 26 [  50/390]  Loss: 0.3988 (0.424)  Acc@1: 84.3750 (86.0600)  Acc@5: 100.0000 (99.5098)
Valid: 26 [ 100/390]  Loss: 0.4060 (0.423)  Acc@1: 90.6250 (86.1541)  Acc@5: 100.0000 (99.5050)
Valid: 26 [ 150/390]  Loss: 0.4321 (0.429)  Acc@1: 85.9375 (86.1962)  Acc@5: 100.0000 (99.3895)
Valid: 26 [ 200/390]  Loss: 0.2573 (0.430)  Acc@1: 93.7500 (86.0852)  Acc@5: 98.4375 (99.3626)
Valid: 26 [ 250/390]  Loss: 0.5524 (0.432)  Acc@1: 82.8125 (85.8503)  Acc@5: 98.4375 (99.3713)
Valid: 26 [ 300/390]  Loss: 0.4156 (0.424)  Acc@1: 84.3750 (86.0828)  Acc@5: 96.8750 (99.3667)
Valid: 26 [ 350/390]  Loss: 0.4181 (0.422)  Acc@1: 85.9375 (86.1111)  Acc@5: 100.0000 (99.3723)
Valid: 26 [ 390/390]  Loss: 0.6475 (0.425)  Acc@1: 75.0000 (85.9680)  Acc@5: 100.0000 (99.3680)
valid_acc 85.968000
epoch = 26   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 2), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1550, 0.0792, 0.0592, 0.1032, 0.1871, 0.1383, 0.1431, 0.1350],
        [0.1557, 0.0681, 0.0523, 0.0804, 0.2029, 0.1725, 0.1288, 0.1394],
        [0.1832, 0.0873, 0.0648, 0.1112, 0.1822, 0.1300, 0.1094, 0.1318],
        [0.1794, 0.0779, 0.0632, 0.1021, 0.1560, 0.1554, 0.1346, 0.1314],
        [0.2023, 0.0618, 0.0512, 0.0935, 0.1463, 0.1611, 0.1460, 0.1378],
        [0.2242, 0.0863, 0.0665, 0.1163, 0.1285, 0.1221, 0.1264, 0.1298],
        [0.2279, 0.0742, 0.0630, 0.0993, 0.1682, 0.1311, 0.1161, 0.1202],
        [0.2835, 0.0571, 0.0507, 0.0964, 0.1313, 0.1232, 0.1229, 0.1349],
        [0.3585, 0.0486, 0.0455, 0.0727, 0.1072, 0.1373, 0.1209, 0.1093],
        [0.2561, 0.0802, 0.0655, 0.1068, 0.1235, 0.1185, 0.1189, 0.1307],
        [0.3236, 0.0660, 0.0561, 0.0839, 0.1410, 0.1154, 0.1065, 0.1076],
        [0.2869, 0.0523, 0.0461, 0.0800, 0.1381, 0.1210, 0.1446, 0.1310],
        [0.3447, 0.0452, 0.0422, 0.0619, 0.1165, 0.1127, 0.1440, 0.1328],
        [0.3644, 0.0392, 0.0372, 0.0450, 0.1155, 0.1212, 0.1300, 0.1476]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1136, 0.1381, 0.1099, 0.1154, 0.1407, 0.1498, 0.1147, 0.1177],
        [0.1225, 0.1174, 0.0990, 0.1331, 0.1460, 0.1222, 0.1382, 0.1216],
        [0.1105, 0.1380, 0.1109, 0.1365, 0.1360, 0.1146, 0.1318, 0.1217],
        [0.1247, 0.1242, 0.1089, 0.1296, 0.1354, 0.1197, 0.1339, 0.1236],
        [0.1224, 0.0953, 0.0839, 0.1264, 0.1479, 0.1382, 0.1390, 0.1469],
        [0.1150, 0.1479, 0.1233, 0.1224, 0.1195, 0.1236, 0.1346, 0.1139],
        [0.1317, 0.1329, 0.1201, 0.1162, 0.1326, 0.1222, 0.1232, 0.1210],
        [0.1314, 0.0946, 0.0900, 0.1438, 0.1483, 0.1375, 0.1234, 0.1311],
        [0.1449, 0.0915, 0.0893, 0.1411, 0.1293, 0.1397, 0.1305, 0.1337],
        [0.1133, 0.1354, 0.1132, 0.1305, 0.1304, 0.1461, 0.1077, 0.1234],
        [0.1231, 0.1205, 0.1085, 0.1239, 0.1201, 0.1373, 0.1294, 0.1372],
        [0.1315, 0.0932, 0.0836, 0.1338, 0.1442, 0.1545, 0.1136, 0.1456],
        [0.1368, 0.0854, 0.0811, 0.1264, 0.1451, 0.1461, 0.1330, 0.1462],
        [0.1539, 0.0788, 0.0742, 0.1015, 0.1445, 0.1350, 0.1501, 0.1621]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 27 [   0/390]  Loss: 0.3865 (0.386)  Acc@1: 84.3750 (84.3750)  Acc@5: 100.0000 (100.0000)LR: 1.150e-02
Train: 27 [  50/390]  Loss: 0.5582 (0.320)  Acc@1: 81.2500 (89.1544)  Acc@5: 98.4375 (99.4792)LR: 1.150e-02
Train: 27 [ 100/390]  Loss: 0.3071 (0.335)  Acc@1: 92.1875 (88.5520)  Acc@5: 100.0000 (99.5514)LR: 1.150e-02
Train: 27 [ 150/390]  Loss: 0.2891 (0.328)  Acc@1: 87.5000 (88.8038)  Acc@5: 100.0000 (99.6275)LR: 1.150e-02
Train: 27 [ 200/390]  Loss: 0.1687 (0.319)  Acc@1: 93.7500 (89.0936)  Acc@5: 100.0000 (99.6657)LR: 1.150e-02
Train: 27 [ 250/390]  Loss: 0.2636 (0.314)  Acc@1: 90.6250 (89.2368)  Acc@5: 100.0000 (99.6950)LR: 1.150e-02
Train: 27 [ 300/390]  Loss: 0.3017 (0.317)  Acc@1: 92.1875 (89.1196)  Acc@5: 100.0000 (99.6833)LR: 1.150e-02
Train: 27 [ 350/390]  Loss: 0.2431 (0.318)  Acc@1: 89.0625 (89.0402)  Acc@5: 100.0000 (99.7017)LR: 1.150e-02
Train: 27 [ 390/390]  Loss: 0.2298 (0.323)  Acc@1: 90.0000 (88.8240)  Acc@5: 100.0000 (99.6800)LR: 1.150e-02
train_acc 88.824000
Valid: 27 [   0/390]  Loss: 0.2722 (0.272)  Acc@1: 89.0625 (89.0625)  Acc@5: 100.0000 (100.0000)
Valid: 27 [  50/390]  Loss: 0.3958 (0.410)  Acc@1: 85.9375 (85.7537)  Acc@5: 98.4375 (99.4485)
Valid: 27 [ 100/390]  Loss: 0.4816 (0.436)  Acc@1: 87.5000 (85.1949)  Acc@5: 100.0000 (99.3193)
Valid: 27 [ 150/390]  Loss: 0.5809 (0.435)  Acc@1: 82.8125 (85.2546)  Acc@5: 98.4375 (99.2446)
Valid: 27 [ 200/390]  Loss: 0.3238 (0.432)  Acc@1: 85.9375 (85.3700)  Acc@5: 100.0000 (99.3703)
Valid: 27 [ 250/390]  Loss: 0.6498 (0.426)  Acc@1: 76.5625 (85.4706)  Acc@5: 100.0000 (99.4086)
Valid: 27 [ 300/390]  Loss: 0.4427 (0.423)  Acc@1: 89.0625 (85.6416)  Acc@5: 100.0000 (99.3978)
Valid: 27 [ 350/390]  Loss: 0.5270 (0.422)  Acc@1: 87.5000 (85.7238)  Acc@5: 100.0000 (99.3768)
Valid: 27 [ 390/390]  Loss: 0.2078 (0.422)  Acc@1: 95.0000 (85.7560)  Acc@5: 100.0000 (99.3680)
valid_acc 85.756000
epoch = 27   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_5x5', 2), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('skip_connect', 0), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1557, 0.0768, 0.0585, 0.1034, 0.1903, 0.1384, 0.1426, 0.1343],
        [0.1578, 0.0661, 0.0513, 0.0794, 0.2049, 0.1734, 0.1277, 0.1394],
        [0.1868, 0.0849, 0.0639, 0.1115, 0.1838, 0.1298, 0.1080, 0.1313],
        [0.1822, 0.0762, 0.0624, 0.1014, 0.1587, 0.1547, 0.1339, 0.1305],
        [0.2052, 0.0599, 0.0499, 0.0921, 0.1474, 0.1606, 0.1461, 0.1389],
        [0.2303, 0.0836, 0.0655, 0.1166, 0.1271, 0.1204, 0.1259, 0.1304],
        [0.2335, 0.0720, 0.0619, 0.0985, 0.1705, 0.1293, 0.1148, 0.1195],
        [0.2923, 0.0552, 0.0496, 0.0953, 0.1310, 0.1217, 0.1207, 0.1340],
        [0.3721, 0.0465, 0.0440, 0.0706, 0.1042, 0.1371, 0.1177, 0.1078],
        [0.2658, 0.0774, 0.0642, 0.1064, 0.1215, 0.1176, 0.1184, 0.1288],
        [0.3373, 0.0636, 0.0546, 0.0825, 0.1397, 0.1120, 0.1056, 0.1046],
        [0.3003, 0.0506, 0.0451, 0.0799, 0.1355, 0.1178, 0.1422, 0.1287],
        [0.3580, 0.0433, 0.0408, 0.0605, 0.1137, 0.1104, 0.1416, 0.1317],
        [0.3816, 0.0373, 0.0357, 0.0436, 0.1124, 0.1189, 0.1262, 0.1443]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1133, 0.1384, 0.1092, 0.1158, 0.1413, 0.1500, 0.1146, 0.1174],
        [0.1228, 0.1164, 0.0981, 0.1345, 0.1462, 0.1224, 0.1388, 0.1207],
        [0.1098, 0.1374, 0.1100, 0.1379, 0.1365, 0.1149, 0.1329, 0.1205],
        [0.1249, 0.1228, 0.1084, 0.1300, 0.1361, 0.1191, 0.1343, 0.1245],
        [0.1229, 0.0934, 0.0837, 0.1261, 0.1491, 0.1392, 0.1386, 0.1469],
        [0.1148, 0.1474, 0.1223, 0.1208, 0.1204, 0.1236, 0.1353, 0.1153],
        [0.1324, 0.1322, 0.1202, 0.1161, 0.1328, 0.1230, 0.1233, 0.1201],
        [0.1316, 0.0924, 0.0900, 0.1438, 0.1490, 0.1375, 0.1243, 0.1313],
        [0.1452, 0.0895, 0.0887, 0.1402, 0.1297, 0.1405, 0.1321, 0.1341],
        [0.1132, 0.1348, 0.1124, 0.1313, 0.1305, 0.1464, 0.1077, 0.1237],
        [0.1227, 0.1197, 0.1083, 0.1248, 0.1209, 0.1369, 0.1287, 0.1380],
        [0.1314, 0.0912, 0.0831, 0.1331, 0.1447, 0.1551, 0.1147, 0.1467],
        [0.1369, 0.0842, 0.0808, 0.1264, 0.1449, 0.1465, 0.1329, 0.1474],
        [0.1538, 0.0773, 0.0738, 0.1008, 0.1461, 0.1366, 0.1497, 0.1618]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 28 [   0/390]  Loss: 0.2121 (0.212)  Acc@1: 93.7500 (93.7500)  Acc@5: 100.0000 (100.0000)LR: 1.075e-02
Train: 28 [  50/390]  Loss: 0.2243 (0.312)  Acc@1: 92.1875 (89.1544)  Acc@5: 100.0000 (99.7549)LR: 1.075e-02
Train: 28 [ 100/390]  Loss: 0.3096 (0.316)  Acc@1: 87.5000 (89.0780)  Acc@5: 100.0000 (99.7215)LR: 1.075e-02
Train: 28 [ 150/390]  Loss: 0.3228 (0.313)  Acc@1: 89.0625 (89.0625)  Acc@5: 100.0000 (99.6999)LR: 1.075e-02
Train: 28 [ 200/390]  Loss: 0.2163 (0.308)  Acc@1: 90.6250 (89.2491)  Acc@5: 100.0000 (99.6891)LR: 1.075e-02
Train: 28 [ 250/390]  Loss: 0.3881 (0.314)  Acc@1: 84.3750 (89.0625)  Acc@5: 100.0000 (99.6887)LR: 1.075e-02
Train: 28 [ 300/390]  Loss: 0.2762 (0.315)  Acc@1: 89.0625 (88.9275)  Acc@5: 100.0000 (99.6782)LR: 1.075e-02
Train: 28 [ 350/390]  Loss: 0.1667 (0.314)  Acc@1: 96.8750 (88.9735)  Acc@5: 100.0000 (99.6973)LR: 1.075e-02
Train: 28 [ 390/390]  Loss: 0.2346 (0.315)  Acc@1: 92.5000 (88.9640)  Acc@5: 100.0000 (99.6840)LR: 1.075e-02
train_acc 88.964000
Valid: 28 [   0/390]  Loss: 0.3540 (0.354)  Acc@1: 89.0625 (89.0625)  Acc@5: 100.0000 (100.0000)
Valid: 28 [  50/390]  Loss: 0.4336 (0.414)  Acc@1: 82.8125 (87.1324)  Acc@5: 100.0000 (99.4179)
Valid: 28 [ 100/390]  Loss: 0.3503 (0.422)  Acc@1: 87.5000 (86.5099)  Acc@5: 100.0000 (99.3812)
Valid: 28 [ 150/390]  Loss: 0.3805 (0.435)  Acc@1: 85.9375 (85.9272)  Acc@5: 100.0000 (99.3481)
Valid: 28 [ 200/390]  Loss: 0.8084 (0.435)  Acc@1: 78.1250 (85.8831)  Acc@5: 96.8750 (99.3937)
Valid: 28 [ 250/390]  Loss: 0.3143 (0.436)  Acc@1: 90.6250 (85.9811)  Acc@5: 100.0000 (99.3837)
Valid: 28 [ 300/390]  Loss: 0.4646 (0.434)  Acc@1: 79.6875 (85.8181)  Acc@5: 98.4375 (99.3823)
Valid: 28 [ 350/390]  Loss: 0.5596 (0.433)  Acc@1: 76.5625 (85.8485)  Acc@5: 98.4375 (99.3812)
Valid: 28 [ 390/390]  Loss: 0.3127 (0.431)  Acc@1: 87.5000 (85.8720)  Acc@5: 100.0000 (99.4080)
valid_acc 85.872000
epoch = 28   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1560, 0.0753, 0.0576, 0.1033, 0.1935, 0.1370, 0.1434, 0.1340],
        [0.1599, 0.0641, 0.0497, 0.0776, 0.2064, 0.1751, 0.1274, 0.1398],
        [0.1895, 0.0839, 0.0631, 0.1123, 0.1849, 0.1306, 0.1052, 0.1304],
        [0.1873, 0.0749, 0.0611, 0.1007, 0.1595, 0.1536, 0.1323, 0.1306],
        [0.2086, 0.0586, 0.0489, 0.0915, 0.1472, 0.1592, 0.1471, 0.1389],
        [0.2362, 0.0820, 0.0647, 0.1168, 0.1252, 0.1191, 0.1257, 0.1303],
        [0.2420, 0.0704, 0.0603, 0.0972, 0.1701, 0.1275, 0.1143, 0.1183],
        [0.2982, 0.0537, 0.0484, 0.0939, 0.1310, 0.1218, 0.1199, 0.1333],
        [0.3854, 0.0447, 0.0424, 0.0682, 0.1019, 0.1362, 0.1148, 0.1064],
        [0.2731, 0.0756, 0.0632, 0.1066, 0.1195, 0.1168, 0.1173, 0.1280],
        [0.3493, 0.0621, 0.0532, 0.0814, 0.1392, 0.1092, 0.1045, 0.1012],
        [0.3090, 0.0494, 0.0441, 0.0791, 0.1334, 0.1164, 0.1418, 0.1268],
        [0.3708, 0.0419, 0.0396, 0.0592, 0.1115, 0.1082, 0.1396, 0.1293],
        [0.3931, 0.0360, 0.0345, 0.0425, 0.1110, 0.1164, 0.1231, 0.1433]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1134, 0.1380, 0.1079, 0.1159, 0.1426, 0.1499, 0.1149, 0.1173],
        [0.1222, 0.1153, 0.0969, 0.1363, 0.1463, 0.1230, 0.1391, 0.1208],
        [0.1101, 0.1374, 0.1094, 0.1375, 0.1368, 0.1150, 0.1337, 0.1201],
        [0.1246, 0.1214, 0.1074, 0.1295, 0.1385, 0.1180, 0.1351, 0.1254],
        [0.1226, 0.0927, 0.0835, 0.1264, 0.1498, 0.1399, 0.1386, 0.1465],
        [0.1150, 0.1469, 0.1215, 0.1214, 0.1206, 0.1239, 0.1351, 0.1157],
        [0.1324, 0.1318, 0.1203, 0.1162, 0.1332, 0.1235, 0.1234, 0.1192],
        [0.1318, 0.0917, 0.0899, 0.1448, 0.1478, 0.1374, 0.1252, 0.1314],
        [0.1458, 0.0883, 0.0878, 0.1403, 0.1309, 0.1410, 0.1323, 0.1337],
        [0.1136, 0.1342, 0.1115, 0.1317, 0.1309, 0.1473, 0.1074, 0.1234],
        [0.1223, 0.1191, 0.1081, 0.1250, 0.1220, 0.1369, 0.1291, 0.1374],
        [0.1301, 0.0903, 0.0825, 0.1329, 0.1452, 0.1548, 0.1160, 0.1481],
        [0.1367, 0.0826, 0.0795, 0.1251, 0.1459, 0.1475, 0.1344, 0.1483],
        [0.1546, 0.0759, 0.0728, 0.0996, 0.1468, 0.1376, 0.1495, 0.1632]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 29 [   0/390]  Loss: 0.4005 (0.401)  Acc@1: 85.9375 (85.9375)  Acc@5: 98.4375 (98.4375)LR: 1.002e-02
Train: 29 [  50/390]  Loss: 0.2732 (0.314)  Acc@1: 95.3125 (88.9400)  Acc@5: 100.0000 (99.6630)LR: 1.002e-02
Train: 29 [ 100/390]  Loss: 0.2824 (0.296)  Acc@1: 90.6250 (89.3719)  Acc@5: 100.0000 (99.7215)LR: 1.002e-02
Train: 29 [ 150/390]  Loss: 0.5231 (0.300)  Acc@1: 82.8125 (89.3108)  Acc@5: 98.4375 (99.6792)LR: 1.002e-02
Train: 29 [ 200/390]  Loss: 0.3010 (0.296)  Acc@1: 89.0625 (89.4123)  Acc@5: 100.0000 (99.6657)LR: 1.002e-02
Train: 29 [ 250/390]  Loss: 0.2071 (0.297)  Acc@1: 92.1875 (89.4236)  Acc@5: 100.0000 (99.6887)LR: 1.002e-02
Train: 29 [ 300/390]  Loss: 0.2113 (0.297)  Acc@1: 93.7500 (89.4570)  Acc@5: 100.0000 (99.6989)LR: 1.002e-02
Train: 29 [ 350/390]  Loss: 0.2314 (0.296)  Acc@1: 89.0625 (89.4854)  Acc@5: 100.0000 (99.6973)LR: 1.002e-02
Train: 29 [ 390/390]  Loss: 0.5444 (0.300)  Acc@1: 87.5000 (89.3600)  Acc@5: 97.5000 (99.7040)LR: 1.002e-02
train_acc 89.360000
Valid: 29 [   0/390]  Loss: 0.4009 (0.401)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)
Valid: 29 [  50/390]  Loss: 0.5520 (0.469)  Acc@1: 84.3750 (84.8958)  Acc@5: 98.4375 (99.2647)
Valid: 29 [ 100/390]  Loss: 0.3559 (0.444)  Acc@1: 90.6250 (85.8137)  Acc@5: 98.4375 (99.2729)
Valid: 29 [ 150/390]  Loss: 0.3200 (0.431)  Acc@1: 85.9375 (86.3204)  Acc@5: 100.0000 (99.3171)
Valid: 29 [ 200/390]  Loss: 0.2958 (0.417)  Acc@1: 89.0625 (86.5749)  Acc@5: 98.4375 (99.3548)
Valid: 29 [ 250/390]  Loss: 0.4638 (0.419)  Acc@1: 82.8125 (86.4106)  Acc@5: 98.4375 (99.3775)
Valid: 29 [ 300/390]  Loss: 0.4993 (0.425)  Acc@1: 84.3750 (86.2490)  Acc@5: 100.0000 (99.4134)
Valid: 29 [ 350/390]  Loss: 0.4736 (0.421)  Acc@1: 84.3750 (86.3871)  Acc@5: 98.4375 (99.3946)
Valid: 29 [ 390/390]  Loss: 0.6962 (0.423)  Acc@1: 77.5000 (86.3560)  Acc@5: 97.5000 (99.3640)
valid_acc 86.356000
epoch = 29   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_3x3', 2), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1565, 0.0735, 0.0570, 0.1037, 0.1958, 0.1365, 0.1441, 0.1330],
        [0.1625, 0.0624, 0.0487, 0.0765, 0.2074, 0.1760, 0.1261, 0.1404],
        [0.1923, 0.0826, 0.0629, 0.1139, 0.1833, 0.1302, 0.1052, 0.1297],
        [0.1908, 0.0739, 0.0606, 0.1010, 0.1600, 0.1527, 0.1309, 0.1301],
        [0.2138, 0.0572, 0.0483, 0.0918, 0.1470, 0.1575, 0.1471, 0.1373],
        [0.2412, 0.0802, 0.0639, 0.1175, 0.1223, 0.1193, 0.1262, 0.1295],
        [0.2483, 0.0689, 0.0592, 0.0964, 0.1699, 0.1264, 0.1132, 0.1178],
        [0.3070, 0.0520, 0.0472, 0.0930, 0.1297, 0.1202, 0.1192, 0.1316],
        [0.3984, 0.0428, 0.0407, 0.0659, 0.0992, 0.1360, 0.1126, 0.1042],
        [0.2803, 0.0735, 0.0621, 0.1064, 0.1172, 0.1154, 0.1165, 0.1285],
        [0.3630, 0.0602, 0.0517, 0.0798, 0.1372, 0.1062, 0.1025, 0.0993],
        [0.3195, 0.0478, 0.0431, 0.0781, 0.1310, 0.1144, 0.1414, 0.1247],
        [0.3849, 0.0401, 0.0382, 0.0574, 0.1091, 0.1063, 0.1375, 0.1264],
        [0.4072, 0.0346, 0.0334, 0.0413, 0.1087, 0.1149, 0.1191, 0.1409]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1136, 0.1387, 0.1075, 0.1153, 0.1421, 0.1499, 0.1162, 0.1168],
        [0.1215, 0.1146, 0.0963, 0.1370, 0.1474, 0.1242, 0.1381, 0.1208],
        [0.1107, 0.1378, 0.1090, 0.1372, 0.1362, 0.1154, 0.1346, 0.1192],
        [0.1230, 0.1206, 0.1066, 0.1302, 0.1410, 0.1184, 0.1357, 0.1244],
        [0.1231, 0.0922, 0.0839, 0.1275, 0.1499, 0.1381, 0.1393, 0.1461],
        [0.1154, 0.1476, 0.1216, 0.1216, 0.1201, 0.1228, 0.1350, 0.1158],
        [0.1320, 0.1316, 0.1207, 0.1159, 0.1326, 0.1232, 0.1245, 0.1195],
        [0.1308, 0.0907, 0.0905, 0.1462, 0.1483, 0.1387, 0.1246, 0.1302],
        [0.1449, 0.0868, 0.0876, 0.1411, 0.1302, 0.1423, 0.1322, 0.1349],
        [0.1137, 0.1339, 0.1107, 0.1315, 0.1307, 0.1493, 0.1074, 0.1228],
        [0.1215, 0.1183, 0.1079, 0.1260, 0.1242, 0.1364, 0.1287, 0.1370],
        [0.1304, 0.0893, 0.0823, 0.1328, 0.1457, 0.1562, 0.1154, 0.1480],
        [0.1377, 0.0808, 0.0790, 0.1247, 0.1477, 0.1473, 0.1343, 0.1485],
        [0.1567, 0.0746, 0.0724, 0.0992, 0.1480, 0.1370, 0.1482, 0.1640]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 30 [   0/390]  Loss: 0.3305 (0.330)  Acc@1: 89.0625 (89.0625)  Acc@5: 100.0000 (100.0000)LR: 9.292e-03
Train: 30 [  50/390]  Loss: 0.2431 (0.277)  Acc@1: 90.6250 (90.6863)  Acc@5: 100.0000 (99.8775)LR: 9.292e-03
Train: 30 [ 100/390]  Loss: 0.1177 (0.285)  Acc@1: 96.8750 (90.1454)  Acc@5: 100.0000 (99.8762)LR: 9.292e-03
Train: 30 [ 150/390]  Loss: 0.1219 (0.292)  Acc@1: 96.8750 (89.9317)  Acc@5: 100.0000 (99.8137)LR: 9.292e-03
Train: 30 [ 200/390]  Loss: 0.1945 (0.290)  Acc@1: 93.7500 (90.0575)  Acc@5: 100.0000 (99.7901)LR: 9.292e-03
Train: 30 [ 250/390]  Loss: 0.3244 (0.288)  Acc@1: 84.3750 (89.9900)  Acc@5: 100.0000 (99.7883)LR: 9.292e-03
Train: 30 [ 300/390]  Loss: 0.3737 (0.292)  Acc@1: 90.6250 (89.9346)  Acc@5: 100.0000 (99.7716)LR: 9.292e-03
Train: 30 [ 350/390]  Loss: 0.3138 (0.292)  Acc@1: 92.1875 (89.8326)  Acc@5: 100.0000 (99.7507)LR: 9.292e-03
Train: 30 [ 390/390]  Loss: 0.1210 (0.294)  Acc@1: 95.0000 (89.7400)  Acc@5: 100.0000 (99.7520)LR: 9.292e-03
train_acc 89.740000
Valid: 30 [   0/390]  Loss: 0.4132 (0.413)  Acc@1: 89.0625 (89.0625)  Acc@5: 100.0000 (100.0000)
Valid: 30 [  50/390]  Loss: 0.1918 (0.426)  Acc@1: 93.7500 (85.7843)  Acc@5: 100.0000 (99.4485)
Valid: 30 [ 100/390]  Loss: 0.5843 (0.424)  Acc@1: 87.5000 (86.2778)  Acc@5: 98.4375 (99.3348)
Valid: 30 [ 150/390]  Loss: 0.3060 (0.419)  Acc@1: 87.5000 (86.2272)  Acc@5: 98.4375 (99.3895)
Valid: 30 [ 200/390]  Loss: 0.1675 (0.420)  Acc@1: 92.1875 (86.2174)  Acc@5: 98.4375 (99.3781)
Valid: 30 [ 250/390]  Loss: 0.4470 (0.422)  Acc@1: 82.8125 (86.0745)  Acc@5: 98.4375 (99.4024)
Valid: 30 [ 300/390]  Loss: 0.4917 (0.416)  Acc@1: 87.5000 (86.2438)  Acc@5: 98.4375 (99.4290)
Valid: 30 [ 350/390]  Loss: 0.3382 (0.412)  Acc@1: 92.1875 (86.3827)  Acc@5: 100.0000 (99.4658)
Valid: 30 [ 390/390]  Loss: 0.2798 (0.412)  Acc@1: 97.5000 (86.3880)  Acc@5: 100.0000 (99.4520)
valid_acc 86.388000
epoch = 30   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_3x3', 2), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1569, 0.0721, 0.0564, 0.1043, 0.1978, 0.1347, 0.1452, 0.1326],
        [0.1651, 0.0607, 0.0478, 0.0760, 0.2097, 0.1746, 0.1253, 0.1407],
        [0.1958, 0.0815, 0.0622, 0.1149, 0.1853, 0.1277, 0.1042, 0.1285],
        [0.1940, 0.0724, 0.0600, 0.1009, 0.1614, 0.1516, 0.1301, 0.1296],
        [0.2200, 0.0557, 0.0474, 0.0912, 0.1441, 0.1568, 0.1480, 0.1368],
        [0.2474, 0.0791, 0.0633, 0.1189, 0.1203, 0.1177, 0.1253, 0.1280],
        [0.2576, 0.0678, 0.0585, 0.0966, 0.1685, 0.1243, 0.1113, 0.1154],
        [0.3175, 0.0506, 0.0461, 0.0921, 0.1276, 0.1193, 0.1167, 0.1301],
        [0.4152, 0.0413, 0.0394, 0.0641, 0.0952, 0.1342, 0.1097, 0.1008],
        [0.2870, 0.0719, 0.0612, 0.1064, 0.1156, 0.1141, 0.1163, 0.1276],
        [0.3777, 0.0584, 0.0507, 0.0787, 0.1333, 0.1034, 0.1007, 0.0972],
        [0.3323, 0.0465, 0.0423, 0.0777, 0.1290, 0.1114, 0.1389, 0.1218],
        [0.4028, 0.0389, 0.0371, 0.0561, 0.1059, 0.1036, 0.1329, 0.1226],
        [0.4220, 0.0332, 0.0322, 0.0400, 0.1045, 0.1132, 0.1165, 0.1383]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1130, 0.1384, 0.1078, 0.1159, 0.1408, 0.1512, 0.1168, 0.1162],
        [0.1215, 0.1135, 0.0955, 0.1363, 0.1495, 0.1250, 0.1374, 0.1212],
        [0.1109, 0.1373, 0.1088, 0.1371, 0.1361, 0.1150, 0.1351, 0.1197],
        [0.1225, 0.1198, 0.1062, 0.1308, 0.1420, 0.1182, 0.1358, 0.1246],
        [0.1229, 0.0905, 0.0839, 0.1281, 0.1502, 0.1383, 0.1409, 0.1450],
        [0.1147, 0.1470, 0.1222, 0.1218, 0.1208, 0.1231, 0.1351, 0.1154],
        [0.1320, 0.1313, 0.1205, 0.1169, 0.1326, 0.1227, 0.1248, 0.1192],
        [0.1298, 0.0895, 0.0909, 0.1479, 0.1497, 0.1392, 0.1242, 0.1288],
        [0.1449, 0.0850, 0.0871, 0.1417, 0.1299, 0.1439, 0.1316, 0.1359],
        [0.1147, 0.1328, 0.1109, 0.1313, 0.1310, 0.1484, 0.1071, 0.1239],
        [0.1209, 0.1173, 0.1073, 0.1261, 0.1261, 0.1370, 0.1278, 0.1374],
        [0.1303, 0.0874, 0.0820, 0.1337, 0.1467, 0.1556, 0.1156, 0.1487],
        [0.1390, 0.0791, 0.0780, 0.1245, 0.1499, 0.1465, 0.1343, 0.1489],
        [0.1575, 0.0729, 0.0717, 0.0989, 0.1501, 0.1382, 0.1468, 0.1639]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 31 [   0/390]  Loss: 0.1745 (0.174)  Acc@1: 95.3125 (95.3125)  Acc@5: 100.0000 (100.0000)LR: 8.583e-03
Train: 31 [  50/390]  Loss: 0.3195 (0.255)  Acc@1: 90.6250 (90.9007)  Acc@5: 98.4375 (99.8468)LR: 8.583e-03
Train: 31 [ 100/390]  Loss: 0.2170 (0.266)  Acc@1: 90.6250 (90.4703)  Acc@5: 100.0000 (99.7370)LR: 8.583e-03
Train: 31 [ 150/390]  Loss: 0.3682 (0.272)  Acc@1: 87.5000 (90.3974)  Acc@5: 100.0000 (99.7413)LR: 8.583e-03
Train: 31 [ 200/390]  Loss: 0.2155 (0.273)  Acc@1: 93.7500 (90.3918)  Acc@5: 100.0000 (99.7746)LR: 8.583e-03
Train: 31 [ 250/390]  Loss: 0.1999 (0.274)  Acc@1: 92.1875 (90.3635)  Acc@5: 100.0000 (99.7510)LR: 8.583e-03
Train: 31 [ 300/390]  Loss: 0.2711 (0.275)  Acc@1: 89.0625 (90.4485)  Acc@5: 100.0000 (99.7456)LR: 8.583e-03
Train: 31 [ 350/390]  Loss: 0.4011 (0.278)  Acc@1: 87.5000 (90.3446)  Acc@5: 100.0000 (99.7596)LR: 8.583e-03
Train: 31 [ 390/390]  Loss: 0.3761 (0.283)  Acc@1: 87.5000 (90.1520)  Acc@5: 100.0000 (99.7480)LR: 8.583e-03
train_acc 90.152000
Valid: 31 [   0/390]  Loss: 0.1931 (0.193)  Acc@1: 95.3125 (95.3125)  Acc@5: 100.0000 (100.0000)
Valid: 31 [  50/390]  Loss: 0.3327 (0.391)  Acc@1: 90.6250 (86.7034)  Acc@5: 98.4375 (99.4792)
Valid: 31 [ 100/390]  Loss: 0.4654 (0.402)  Acc@1: 84.3750 (86.3552)  Acc@5: 100.0000 (99.4895)
Valid: 31 [ 150/390]  Loss: 0.3374 (0.391)  Acc@1: 87.5000 (86.9412)  Acc@5: 98.4375 (99.4619)
Valid: 31 [ 200/390]  Loss: 0.3708 (0.394)  Acc@1: 87.5000 (86.9714)  Acc@5: 100.0000 (99.4636)
Valid: 31 [ 250/390]  Loss: 0.4384 (0.403)  Acc@1: 82.8125 (86.7654)  Acc@5: 100.0000 (99.4397)
Valid: 31 [ 300/390]  Loss: 0.4248 (0.398)  Acc@1: 84.3750 (86.7733)  Acc@5: 100.0000 (99.4861)
Valid: 31 [ 350/390]  Loss: 0.5418 (0.395)  Acc@1: 84.3750 (86.9079)  Acc@5: 96.8750 (99.4970)
Valid: 31 [ 390/390]  Loss: 0.2008 (0.393)  Acc@1: 95.0000 (87.0080)  Acc@5: 100.0000 (99.5000)
valid_acc 87.008000
epoch = 31   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_3x3', 2), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('max_pool_3x3', 0), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1579, 0.0708, 0.0557, 0.1050, 0.1981, 0.1349, 0.1455, 0.1321],
        [0.1678, 0.0589, 0.0466, 0.0748, 0.2111, 0.1751, 0.1246, 0.1411],
        [0.1979, 0.0803, 0.0616, 0.1159, 0.1865, 0.1272, 0.1035, 0.1272],
        [0.1971, 0.0706, 0.0588, 0.0999, 0.1631, 0.1514, 0.1298, 0.1293],
        [0.2259, 0.0543, 0.0464, 0.0907, 0.1423, 0.1555, 0.1483, 0.1366],
        [0.2530, 0.0781, 0.0631, 0.1206, 0.1181, 0.1169, 0.1245, 0.1257],
        [0.2691, 0.0659, 0.0571, 0.0953, 0.1661, 0.1226, 0.1100, 0.1137],
        [0.3289, 0.0491, 0.0452, 0.0913, 0.1235, 0.1180, 0.1146, 0.1295],
        [0.4311, 0.0397, 0.0382, 0.0622, 0.0932, 0.1312, 0.1061, 0.0983],
        [0.2934, 0.0703, 0.0602, 0.1064, 0.1134, 0.1133, 0.1158, 0.1272],
        [0.3955, 0.0562, 0.0491, 0.0770, 0.1298, 0.1000, 0.0972, 0.0952],
        [0.3455, 0.0449, 0.0411, 0.0768, 0.1261, 0.1092, 0.1361, 0.1204],
        [0.4199, 0.0374, 0.0360, 0.0549, 0.1020, 0.1010, 0.1293, 0.1195],
        [0.4341, 0.0317, 0.0310, 0.0386, 0.1021, 0.1130, 0.1135, 0.1359]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1127, 0.1373, 0.1075, 0.1161, 0.1410, 0.1517, 0.1173, 0.1164],
        [0.1214, 0.1117, 0.0942, 0.1370, 0.1512, 0.1269, 0.1363, 0.1214],
        [0.1108, 0.1350, 0.1075, 0.1377, 0.1376, 0.1150, 0.1352, 0.1211],
        [0.1223, 0.1184, 0.1053, 0.1308, 0.1416, 0.1177, 0.1383, 0.1255],
        [0.1229, 0.0890, 0.0828, 0.1276, 0.1512, 0.1402, 0.1395, 0.1468],
        [0.1151, 0.1456, 0.1217, 0.1223, 0.1225, 0.1224, 0.1353, 0.1151],
        [0.1328, 0.1303, 0.1200, 0.1164, 0.1335, 0.1233, 0.1251, 0.1186],
        [0.1292, 0.0878, 0.0900, 0.1484, 0.1497, 0.1404, 0.1253, 0.1291],
        [0.1436, 0.0834, 0.0862, 0.1424, 0.1304, 0.1451, 0.1318, 0.1371],
        [0.1153, 0.1318, 0.1107, 0.1304, 0.1323, 0.1496, 0.1061, 0.1237],
        [0.1216, 0.1165, 0.1066, 0.1255, 0.1267, 0.1370, 0.1273, 0.1388],
        [0.1303, 0.0860, 0.0809, 0.1336, 0.1476, 0.1558, 0.1170, 0.1487],
        [0.1378, 0.0779, 0.0772, 0.1246, 0.1520, 0.1474, 0.1347, 0.1485],
        [0.1572, 0.0717, 0.0708, 0.0984, 0.1515, 0.1381, 0.1472, 0.1650]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 32 [   0/390]  Loss: 0.2714 (0.271)  Acc@1: 93.7500 (93.7500)  Acc@5: 100.0000 (100.0000)LR: 7.891e-03
Train: 32 [  50/390]  Loss: 0.4329 (0.264)  Acc@1: 87.5000 (90.9314)  Acc@5: 100.0000 (99.9387)LR: 7.891e-03
Train: 32 [ 100/390]  Loss: 0.1823 (0.262)  Acc@1: 92.1875 (90.9808)  Acc@5: 100.0000 (99.8608)LR: 7.891e-03
Train: 32 [ 150/390]  Loss: 0.3608 (0.267)  Acc@1: 89.0625 (90.8320)  Acc@5: 100.0000 (99.7620)LR: 7.891e-03
Train: 32 [ 200/390]  Loss: 0.1184 (0.270)  Acc@1: 98.4375 (90.6250)  Acc@5: 100.0000 (99.7746)LR: 7.891e-03
Train: 32 [ 250/390]  Loss: 0.3000 (0.277)  Acc@1: 84.3750 (90.3698)  Acc@5: 100.0000 (99.7634)LR: 7.891e-03
Train: 32 [ 300/390]  Loss: 0.1888 (0.279)  Acc@1: 90.6250 (90.3032)  Acc@5: 100.0000 (99.7768)LR: 7.891e-03
Train: 32 [ 350/390]  Loss: 0.2705 (0.278)  Acc@1: 92.1875 (90.3579)  Acc@5: 100.0000 (99.7819)LR: 7.891e-03
Train: 32 [ 390/390]  Loss: 0.1030 (0.277)  Acc@1: 97.5000 (90.3680)  Acc@5: 100.0000 (99.7920)LR: 7.891e-03
train_acc 90.368000
Valid: 32 [   0/390]  Loss: 0.2938 (0.294)  Acc@1: 87.5000 (87.5000)  Acc@5: 98.4375 (98.4375)
Valid: 32 [  50/390]  Loss: 0.3776 (0.409)  Acc@1: 85.9375 (86.8260)  Acc@5: 100.0000 (99.4179)
Valid: 32 [ 100/390]  Loss: 0.2710 (0.416)  Acc@1: 93.7500 (86.1541)  Acc@5: 98.4375 (99.4585)
Valid: 32 [ 150/390]  Loss: 0.5358 (0.410)  Acc@1: 82.8125 (86.4031)  Acc@5: 98.4375 (99.4412)
Valid: 32 [ 200/390]  Loss: 0.2291 (0.401)  Acc@1: 92.1875 (86.5827)  Acc@5: 100.0000 (99.4325)
Valid: 32 [ 250/390]  Loss: 0.4162 (0.403)  Acc@1: 82.8125 (86.5040)  Acc@5: 98.4375 (99.4086)
Valid: 32 [ 300/390]  Loss: 0.5082 (0.397)  Acc@1: 82.8125 (86.7733)  Acc@5: 96.8750 (99.4186)
Valid: 32 [ 350/390]  Loss: 0.3050 (0.403)  Acc@1: 89.0625 (86.6453)  Acc@5: 100.0000 (99.3679)
Valid: 32 [ 390/390]  Loss: 0.3620 (0.403)  Acc@1: 87.5000 (86.6040)  Acc@5: 100.0000 (99.3960)
valid_acc 86.604000
epoch = 32   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('sep_conv_5x5', 3), ('dil_conv_3x3', 2), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('skip_connect', 2), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1587, 0.0689, 0.0549, 0.1047, 0.1998, 0.1359, 0.1450, 0.1321],
        [0.1695, 0.0572, 0.0453, 0.0731, 0.2145, 0.1757, 0.1242, 0.1405],
        [0.2010, 0.0787, 0.0610, 0.1163, 0.1867, 0.1270, 0.1031, 0.1262],
        [0.1997, 0.0695, 0.0582, 0.0997, 0.1634, 0.1514, 0.1299, 0.1280],
        [0.2296, 0.0530, 0.0457, 0.0904, 0.1426, 0.1539, 0.1476, 0.1372],
        [0.2598, 0.0763, 0.0626, 0.1215, 0.1168, 0.1154, 0.1240, 0.1235],
        [0.2764, 0.0643, 0.0559, 0.0938, 0.1660, 0.1214, 0.1105, 0.1117],
        [0.3375, 0.0479, 0.0442, 0.0902, 0.1214, 0.1176, 0.1130, 0.1282],
        [0.4465, 0.0382, 0.0371, 0.0605, 0.0911, 0.1282, 0.1027, 0.0957],
        [0.3019, 0.0683, 0.0593, 0.1059, 0.1114, 0.1124, 0.1153, 0.1255],
        [0.4080, 0.0546, 0.0479, 0.0756, 0.1264, 0.0980, 0.0958, 0.0936],
        [0.3580, 0.0438, 0.0403, 0.0759, 0.1231, 0.1059, 0.1339, 0.1190],
        [0.4360, 0.0361, 0.0351, 0.0537, 0.0991, 0.0979, 0.1258, 0.1163],
        [0.4509, 0.0307, 0.0302, 0.0377, 0.0989, 0.1094, 0.1086, 0.1338]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1114, 0.1375, 0.1074, 0.1164, 0.1407, 0.1524, 0.1178, 0.1163],
        [0.1224, 0.1110, 0.0938, 0.1368, 0.1520, 0.1274, 0.1360, 0.1205],
        [0.1097, 0.1349, 0.1073, 0.1368, 0.1379, 0.1151, 0.1372, 0.1210],
        [0.1227, 0.1177, 0.1054, 0.1318, 0.1427, 0.1165, 0.1378, 0.1253],
        [0.1230, 0.0879, 0.0829, 0.1279, 0.1524, 0.1392, 0.1400, 0.1467],
        [0.1136, 0.1459, 0.1221, 0.1229, 0.1240, 0.1218, 0.1350, 0.1148],
        [0.1340, 0.1299, 0.1205, 0.1159, 0.1346, 0.1241, 0.1237, 0.1174],
        [0.1293, 0.0869, 0.0906, 0.1497, 0.1493, 0.1405, 0.1248, 0.1289],
        [0.1429, 0.0826, 0.0861, 0.1427, 0.1301, 0.1464, 0.1322, 0.1370],
        [0.1159, 0.1314, 0.1101, 0.1293, 0.1329, 0.1516, 0.1052, 0.1236],
        [0.1219, 0.1162, 0.1072, 0.1250, 0.1266, 0.1364, 0.1280, 0.1387],
        [0.1309, 0.0847, 0.0808, 0.1339, 0.1477, 0.1550, 0.1185, 0.1484],
        [0.1379, 0.0768, 0.0769, 0.1241, 0.1531, 0.1483, 0.1353, 0.1476],
        [0.1579, 0.0706, 0.0706, 0.0980, 0.1522, 0.1391, 0.1469, 0.1647]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 33 [   0/390]  Loss: 0.4820 (0.482)  Acc@1: 84.3750 (84.3750)  Acc@5: 100.0000 (100.0000)LR: 7.219e-03
Train: 33 [  50/390]  Loss: 0.2318 (0.260)  Acc@1: 92.1875 (91.2377)  Acc@5: 98.4375 (99.7243)LR: 7.219e-03
Train: 33 [ 100/390]  Loss: 0.2646 (0.254)  Acc@1: 95.3125 (91.5996)  Acc@5: 100.0000 (99.7834)LR: 7.219e-03
Train: 33 [ 150/390]  Loss: 0.1947 (0.262)  Acc@1: 90.6250 (91.1320)  Acc@5: 100.0000 (99.7724)LR: 7.219e-03
Train: 33 [ 200/390]  Loss: 0.4604 (0.260)  Acc@1: 82.8125 (91.1225)  Acc@5: 100.0000 (99.7901)LR: 7.219e-03
Train: 33 [ 250/390]  Loss: 0.2782 (0.263)  Acc@1: 87.5000 (90.9612)  Acc@5: 100.0000 (99.8008)LR: 7.219e-03
Train: 33 [ 300/390]  Loss: 0.4672 (0.264)  Acc@1: 82.8125 (90.7963)  Acc@5: 100.0000 (99.8235)LR: 7.219e-03
Train: 33 [ 350/390]  Loss: 0.5737 (0.268)  Acc@1: 76.5625 (90.6562)  Acc@5: 100.0000 (99.7908)LR: 7.219e-03
Train: 33 [ 390/390]  Loss: 0.1790 (0.270)  Acc@1: 92.5000 (90.6400)  Acc@5: 100.0000 (99.7920)LR: 7.219e-03
train_acc 90.640000
Valid: 33 [   0/390]  Loss: 0.4139 (0.414)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)
Valid: 33 [  50/390]  Loss: 0.6300 (0.369)  Acc@1: 79.6875 (87.3162)  Acc@5: 98.4375 (99.6017)
Valid: 33 [ 100/390]  Loss: 0.2990 (0.380)  Acc@1: 89.0625 (87.3453)  Acc@5: 100.0000 (99.4431)
Valid: 33 [ 150/390]  Loss: 0.2748 (0.377)  Acc@1: 93.7500 (87.5517)  Acc@5: 98.4375 (99.4619)
Valid: 33 [ 200/390]  Loss: 0.2280 (0.383)  Acc@1: 93.7500 (87.4300)  Acc@5: 100.0000 (99.4792)
Valid: 33 [ 250/390]  Loss: 0.4454 (0.381)  Acc@1: 85.9375 (87.4253)  Acc@5: 98.4375 (99.4771)
Valid: 33 [ 300/390]  Loss: 0.3903 (0.382)  Acc@1: 85.9375 (87.3806)  Acc@5: 100.0000 (99.5017)
Valid: 33 [ 350/390]  Loss: 0.2653 (0.383)  Acc@1: 93.7500 (87.3665)  Acc@5: 100.0000 (99.5014)
Valid: 33 [ 390/390]  Loss: 0.5630 (0.382)  Acc@1: 85.0000 (87.3880)  Acc@5: 100.0000 (99.5160)
valid_acc 87.388000
epoch = 33   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1591, 0.0675, 0.0542, 0.1047, 0.2002, 0.1358, 0.1458, 0.1328],
        [0.1732, 0.0558, 0.0442, 0.0719, 0.2153, 0.1773, 0.1224, 0.1399],
        [0.2040, 0.0770, 0.0601, 0.1162, 0.1876, 0.1278, 0.1024, 0.1249],
        [0.2037, 0.0676, 0.0568, 0.0981, 0.1655, 0.1515, 0.1292, 0.1277],
        [0.2334, 0.0515, 0.0446, 0.0889, 0.1425, 0.1529, 0.1490, 0.1372],
        [0.2657, 0.0751, 0.0620, 0.1222, 0.1158, 0.1137, 0.1234, 0.1220],
        [0.2878, 0.0626, 0.0547, 0.0930, 0.1641, 0.1191, 0.1082, 0.1106],
        [0.3461, 0.0468, 0.0433, 0.0887, 0.1201, 0.1164, 0.1120, 0.1266],
        [0.4604, 0.0373, 0.0364, 0.0593, 0.0885, 0.1254, 0.0995, 0.0933],
        [0.3110, 0.0665, 0.0585, 0.1056, 0.1092, 0.1115, 0.1138, 0.1238],
        [0.4212, 0.0529, 0.0468, 0.0744, 0.1230, 0.0971, 0.0936, 0.0910],
        [0.3688, 0.0427, 0.0396, 0.0750, 0.1211, 0.1033, 0.1318, 0.1177],
        [0.4504, 0.0351, 0.0344, 0.0527, 0.0962, 0.0945, 0.1223, 0.1144],
        [0.4658, 0.0296, 0.0294, 0.0368, 0.0955, 0.1066, 0.1042, 0.1321]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1119, 0.1377, 0.1068, 0.1158, 0.1403, 0.1525, 0.1179, 0.1170],
        [0.1216, 0.1095, 0.0922, 0.1366, 0.1538, 0.1292, 0.1358, 0.1211],
        [0.1100, 0.1348, 0.1064, 0.1363, 0.1389, 0.1144, 0.1391, 0.1202],
        [0.1229, 0.1164, 0.1042, 0.1325, 0.1435, 0.1169, 0.1380, 0.1256],
        [0.1225, 0.0882, 0.0831, 0.1279, 0.1531, 0.1395, 0.1393, 0.1464],
        [0.1142, 0.1453, 0.1213, 0.1224, 0.1249, 0.1225, 0.1350, 0.1144],
        [0.1346, 0.1285, 0.1193, 0.1155, 0.1353, 0.1258, 0.1240, 0.1169],
        [0.1292, 0.0866, 0.0903, 0.1497, 0.1505, 0.1399, 0.1251, 0.1287],
        [0.1423, 0.0815, 0.0852, 0.1414, 0.1308, 0.1483, 0.1326, 0.1379],
        [0.1163, 0.1301, 0.1089, 0.1294, 0.1341, 0.1520, 0.1044, 0.1248],
        [0.1212, 0.1152, 0.1068, 0.1260, 0.1278, 0.1363, 0.1284, 0.1384],
        [0.1307, 0.0836, 0.0802, 0.1332, 0.1488, 0.1549, 0.1188, 0.1499],
        [0.1376, 0.0754, 0.0760, 0.1230, 0.1540, 0.1495, 0.1360, 0.1484],
        [0.1580, 0.0690, 0.0701, 0.0973, 0.1528, 0.1399, 0.1468, 0.1660]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 34 [   0/390]  Loss: 0.2126 (0.213)  Acc@1: 95.3125 (95.3125)  Acc@5: 100.0000 (100.0000)LR: 6.570e-03
Train: 34 [  50/390]  Loss: 0.2478 (0.267)  Acc@1: 90.6250 (90.8088)  Acc@5: 100.0000 (99.7855)LR: 6.570e-03
Train: 34 [ 100/390]  Loss: 0.2847 (0.265)  Acc@1: 90.6250 (90.7178)  Acc@5: 100.0000 (99.8608)LR: 6.570e-03
Train: 34 [ 150/390]  Loss: 0.4378 (0.272)  Acc@1: 81.2500 (90.5940)  Acc@5: 100.0000 (99.8551)LR: 6.570e-03
Train: 34 [ 200/390]  Loss: 0.2262 (0.270)  Acc@1: 89.0625 (90.5473)  Acc@5: 100.0000 (99.8368)LR: 6.570e-03
Train: 34 [ 250/390]  Loss: 0.2335 (0.267)  Acc@1: 90.6250 (90.6624)  Acc@5: 100.0000 (99.7883)LR: 6.570e-03
Train: 34 [ 300/390]  Loss: 0.2499 (0.263)  Acc@1: 92.1875 (90.8119)  Acc@5: 98.4375 (99.7924)LR: 6.570e-03
Train: 34 [ 350/390]  Loss: 0.2852 (0.266)  Acc@1: 96.8750 (90.7229)  Acc@5: 100.0000 (99.7952)LR: 6.570e-03
Train: 34 [ 390/390]  Loss: 0.3664 (0.266)  Acc@1: 82.5000 (90.8000)  Acc@5: 100.0000 (99.7960)LR: 6.570e-03
train_acc 90.800000
Valid: 34 [   0/390]  Loss: 0.4622 (0.462)  Acc@1: 87.5000 (87.5000)  Acc@5: 98.4375 (98.4375)
Valid: 34 [  50/390]  Loss: 0.2281 (0.414)  Acc@1: 90.6250 (87.0711)  Acc@5: 100.0000 (99.5711)
Valid: 34 [ 100/390]  Loss: 0.4710 (0.391)  Acc@1: 87.5000 (87.5155)  Acc@5: 100.0000 (99.5514)
Valid: 34 [ 150/390]  Loss: 0.5128 (0.392)  Acc@1: 79.6875 (87.3655)  Acc@5: 100.0000 (99.6068)
Valid: 34 [ 200/390]  Loss: 0.3791 (0.395)  Acc@1: 92.1875 (87.1735)  Acc@5: 98.4375 (99.6035)
Valid: 34 [ 250/390]  Loss: 0.3594 (0.394)  Acc@1: 87.5000 (87.2012)  Acc@5: 96.8750 (99.5705)
Valid: 34 [ 300/390]  Loss: 0.4293 (0.400)  Acc@1: 87.5000 (87.0795)  Acc@5: 100.0000 (99.5432)
Valid: 34 [ 350/390]  Loss: 0.4099 (0.396)  Acc@1: 85.9375 (87.1350)  Acc@5: 98.4375 (99.5326)
Valid: 34 [ 390/390]  Loss: 0.4888 (0.395)  Acc@1: 90.0000 (87.2320)  Acc@5: 97.5000 (99.5280)
valid_acc 87.232000
epoch = 34   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('dil_conv_3x3', 2), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1608, 0.0658, 0.0532, 0.1040, 0.2020, 0.1355, 0.1455, 0.1331],
        [0.1753, 0.0548, 0.0435, 0.0711, 0.2159, 0.1781, 0.1218, 0.1395],
        [0.2075, 0.0749, 0.0589, 0.1154, 0.1881, 0.1287, 0.1015, 0.1249],
        [0.2059, 0.0663, 0.0561, 0.0973, 0.1663, 0.1525, 0.1281, 0.1274],
        [0.2390, 0.0499, 0.0437, 0.0876, 0.1417, 0.1515, 0.1495, 0.1371],
        [0.2736, 0.0734, 0.0610, 0.1220, 0.1142, 0.1126, 0.1223, 0.1210],
        [0.2975, 0.0613, 0.0540, 0.0923, 0.1613, 0.1174, 0.1075, 0.1086],
        [0.3562, 0.0454, 0.0424, 0.0869, 0.1189, 0.1146, 0.1111, 0.1243],
        [0.4721, 0.0362, 0.0354, 0.0574, 0.0866, 0.1242, 0.0969, 0.0912],
        [0.3224, 0.0645, 0.0571, 0.1045, 0.1060, 0.1099, 0.1133, 0.1223],
        [0.4322, 0.0516, 0.0458, 0.0732, 0.1208, 0.0956, 0.0916, 0.0893],
        [0.3808, 0.0414, 0.0388, 0.0740, 0.1192, 0.1011, 0.1296, 0.1152],
        [0.4638, 0.0341, 0.0336, 0.0517, 0.0940, 0.0914, 0.1198, 0.1116],
        [0.4790, 0.0286, 0.0287, 0.0361, 0.0923, 0.1046, 0.1012, 0.1295]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1114, 0.1379, 0.1069, 0.1149, 0.1411, 0.1533, 0.1181, 0.1165],
        [0.1217, 0.1082, 0.0911, 0.1370, 0.1535, 0.1316, 0.1358, 0.1211],
        [0.1091, 0.1350, 0.1068, 0.1357, 0.1397, 0.1141, 0.1402, 0.1195],
        [0.1238, 0.1147, 0.1029, 0.1317, 0.1447, 0.1167, 0.1391, 0.1266],
        [0.1218, 0.0872, 0.0827, 0.1275, 0.1550, 0.1402, 0.1395, 0.1460],
        [0.1131, 0.1448, 0.1217, 0.1220, 0.1262, 0.1225, 0.1360, 0.1135],
        [0.1356, 0.1273, 0.1188, 0.1152, 0.1354, 0.1268, 0.1248, 0.1162],
        [0.1297, 0.0858, 0.0902, 0.1506, 0.1514, 0.1411, 0.1245, 0.1267],
        [0.1428, 0.0806, 0.0847, 0.1413, 0.1303, 0.1493, 0.1344, 0.1367],
        [0.1158, 0.1305, 0.1096, 0.1293, 0.1338, 0.1525, 0.1041, 0.1244],
        [0.1212, 0.1137, 0.1058, 0.1261, 0.1285, 0.1367, 0.1296, 0.1382],
        [0.1305, 0.0828, 0.0800, 0.1331, 0.1478, 0.1567, 0.1183, 0.1507],
        [0.1377, 0.0749, 0.0765, 0.1240, 0.1534, 0.1491, 0.1370, 0.1473],
        [0.1592, 0.0685, 0.0704, 0.0975, 0.1540, 0.1398, 0.1464, 0.1641]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 35 [   0/390]  Loss: 0.1876 (0.188)  Acc@1: 95.3125 (95.3125)  Acc@5: 100.0000 (100.0000)LR: 5.947e-03
Train: 35 [  50/390]  Loss: 0.2059 (0.232)  Acc@1: 93.7500 (92.1262)  Acc@5: 98.4375 (99.8468)LR: 5.947e-03
Train: 35 [ 100/390]  Loss: 0.3307 (0.244)  Acc@1: 93.7500 (91.5532)  Acc@5: 100.0000 (99.8762)LR: 5.947e-03
Train: 35 [ 150/390]  Loss: 0.1935 (0.251)  Acc@1: 90.6250 (91.1113)  Acc@5: 100.0000 (99.8965)LR: 5.947e-03
Train: 35 [ 200/390]  Loss: 0.2080 (0.258)  Acc@1: 90.6250 (90.9748)  Acc@5: 100.0000 (99.8368)LR: 5.947e-03
Train: 35 [ 250/390]  Loss: 0.2545 (0.257)  Acc@1: 93.7500 (91.1915)  Acc@5: 100.0000 (99.8008)LR: 5.947e-03
Train: 35 [ 300/390]  Loss: 0.1803 (0.253)  Acc@1: 93.7500 (91.2635)  Acc@5: 100.0000 (99.8131)LR: 5.947e-03
Train: 35 [ 350/390]  Loss: 0.2283 (0.256)  Acc@1: 92.1875 (91.1280)  Acc@5: 100.0000 (99.8175)LR: 5.947e-03
Train: 35 [ 390/390]  Loss: 0.2038 (0.254)  Acc@1: 92.5000 (91.1560)  Acc@5: 100.0000 (99.8240)LR: 5.947e-03
train_acc 91.156000
Valid: 35 [   0/390]  Loss: 0.2570 (0.257)  Acc@1: 90.6250 (90.6250)  Acc@5: 98.4375 (98.4375)
Valid: 35 [  50/390]  Loss: 0.3506 (0.401)  Acc@1: 87.5000 (87.0098)  Acc@5: 100.0000 (99.3566)
Valid: 35 [ 100/390]  Loss: 0.5608 (0.394)  Acc@1: 81.2500 (87.1287)  Acc@5: 100.0000 (99.4431)
Valid: 35 [ 150/390]  Loss: 0.5518 (0.402)  Acc@1: 84.3750 (87.0550)  Acc@5: 98.4375 (99.4309)
Valid: 35 [ 200/390]  Loss: 0.4794 (0.402)  Acc@1: 87.5000 (87.1502)  Acc@5: 98.4375 (99.4325)
Valid: 35 [ 250/390]  Loss: 0.7804 (0.397)  Acc@1: 73.4375 (87.1327)  Acc@5: 98.4375 (99.4397)
Valid: 35 [ 300/390]  Loss: 0.4064 (0.397)  Acc@1: 90.6250 (87.0172)  Acc@5: 100.0000 (99.4757)
Valid: 35 [ 350/390]  Loss: 0.5675 (0.400)  Acc@1: 82.8125 (86.9747)  Acc@5: 98.4375 (99.4569)
Valid: 35 [ 390/390]  Loss: 0.4611 (0.402)  Acc@1: 85.0000 (86.8720)  Acc@5: 100.0000 (99.4480)
valid_acc 86.872000
epoch = 35   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('sep_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1617, 0.0643, 0.0524, 0.1039, 0.2049, 0.1341, 0.1456, 0.1330],
        [0.1786, 0.0534, 0.0427, 0.0702, 0.2165, 0.1788, 0.1203, 0.1396],
        [0.2119, 0.0735, 0.0582, 0.1156, 0.1872, 0.1290, 0.0995, 0.1252],
        [0.2111, 0.0652, 0.0556, 0.0972, 0.1658, 0.1505, 0.1279, 0.1268],
        [0.2431, 0.0484, 0.0427, 0.0863, 0.1424, 0.1501, 0.1498, 0.1372],
        [0.2797, 0.0719, 0.0603, 0.1228, 0.1125, 0.1114, 0.1224, 0.1190],
        [0.3101, 0.0598, 0.0530, 0.0921, 0.1588, 0.1139, 0.1062, 0.1061],
        [0.3683, 0.0442, 0.0414, 0.0858, 0.1176, 0.1134, 0.1078, 0.1215],
        [0.4885, 0.0349, 0.0343, 0.0557, 0.0844, 0.1211, 0.0928, 0.0883],
        [0.3322, 0.0630, 0.0559, 0.1039, 0.1034, 0.1082, 0.1126, 0.1209],
        [0.4472, 0.0502, 0.0448, 0.0720, 0.1167, 0.0935, 0.0887, 0.0870],
        [0.3941, 0.0403, 0.0380, 0.0731, 0.1165, 0.0985, 0.1261, 0.1134],
        [0.4810, 0.0331, 0.0328, 0.0505, 0.0898, 0.0883, 0.1156, 0.1089],
        [0.4932, 0.0275, 0.0278, 0.0351, 0.0895, 0.1016, 0.0978, 0.1275]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1095, 0.1385, 0.1070, 0.1146, 0.1422, 0.1545, 0.1185, 0.1152],
        [0.1235, 0.1067, 0.0898, 0.1374, 0.1549, 0.1317, 0.1348, 0.1212],
        [0.1078, 0.1355, 0.1066, 0.1345, 0.1401, 0.1139, 0.1415, 0.1201],
        [0.1242, 0.1134, 0.1015, 0.1305, 0.1456, 0.1175, 0.1395, 0.1278],
        [0.1222, 0.0866, 0.0824, 0.1281, 0.1549, 0.1398, 0.1403, 0.1456],
        [0.1132, 0.1445, 0.1212, 0.1213, 0.1261, 0.1235, 0.1357, 0.1144],
        [0.1365, 0.1261, 0.1179, 0.1146, 0.1356, 0.1278, 0.1257, 0.1158],
        [0.1303, 0.0848, 0.0894, 0.1512, 0.1518, 0.1420, 0.1249, 0.1258],
        [0.1423, 0.0797, 0.0840, 0.1413, 0.1301, 0.1503, 0.1342, 0.1381],
        [0.1152, 0.1302, 0.1090, 0.1282, 0.1340, 0.1546, 0.1034, 0.1254],
        [0.1220, 0.1131, 0.1055, 0.1249, 0.1300, 0.1363, 0.1295, 0.1388],
        [0.1312, 0.0817, 0.0793, 0.1335, 0.1474, 0.1568, 0.1180, 0.1520],
        [0.1367, 0.0743, 0.0758, 0.1236, 0.1566, 0.1500, 0.1360, 0.1470],
        [0.1592, 0.0674, 0.0698, 0.0972, 0.1551, 0.1403, 0.1474, 0.1635]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 36 [   0/390]  Loss: 0.1977 (0.198)  Acc@1: 90.6250 (90.6250)  Acc@5: 100.0000 (100.0000)LR: 5.351e-03
Train: 36 [  50/390]  Loss: 0.1487 (0.252)  Acc@1: 95.3125 (91.3909)  Acc@5: 100.0000 (99.7855)LR: 5.351e-03
Train: 36 [ 100/390]  Loss: 0.1547 (0.247)  Acc@1: 95.3125 (91.3830)  Acc@5: 100.0000 (99.8298)LR: 5.351e-03
Train: 36 [ 150/390]  Loss: 0.2962 (0.253)  Acc@1: 87.5000 (90.9768)  Acc@5: 100.0000 (99.8137)LR: 5.351e-03
Train: 36 [ 200/390]  Loss: 0.1500 (0.255)  Acc@1: 96.8750 (90.8738)  Acc@5: 100.0000 (99.8212)LR: 5.351e-03
Train: 36 [ 250/390]  Loss: 0.2439 (0.257)  Acc@1: 90.6250 (90.7744)  Acc@5: 100.0000 (99.8319)LR: 5.351e-03
Train: 36 [ 300/390]  Loss: 0.1849 (0.260)  Acc@1: 93.7500 (90.6873)  Acc@5: 100.0000 (99.8287)LR: 5.351e-03
Train: 36 [ 350/390]  Loss: 0.2227 (0.259)  Acc@1: 89.0625 (90.8209)  Acc@5: 100.0000 (99.8353)LR: 5.351e-03
Train: 36 [ 390/390]  Loss: 0.1588 (0.261)  Acc@1: 92.5000 (90.7000)  Acc@5: 100.0000 (99.8360)LR: 5.351e-03
train_acc 90.700000
Valid: 36 [   0/390]  Loss: 0.3876 (0.388)  Acc@1: 90.6250 (90.6250)  Acc@5: 98.4375 (98.4375)
Valid: 36 [  50/390]  Loss: 0.3883 (0.369)  Acc@1: 82.8125 (87.6838)  Acc@5: 98.4375 (99.4179)
Valid: 36 [ 100/390]  Loss: 0.5600 (0.366)  Acc@1: 81.2500 (87.6083)  Acc@5: 98.4375 (99.5050)
Valid: 36 [ 150/390]  Loss: 0.3941 (0.368)  Acc@1: 85.9375 (87.6759)  Acc@5: 100.0000 (99.5033)
Valid: 36 [ 200/390]  Loss: 0.3119 (0.368)  Acc@1: 87.5000 (87.6632)  Acc@5: 98.4375 (99.5414)
Valid: 36 [ 250/390]  Loss: 0.3155 (0.369)  Acc@1: 87.5000 (87.6743)  Acc@5: 100.0000 (99.5518)
Valid: 36 [ 300/390]  Loss: 0.1002 (0.372)  Acc@1: 95.3125 (87.5831)  Acc@5: 100.0000 (99.5328)
Valid: 36 [ 350/390]  Loss: 0.2075 (0.371)  Acc@1: 90.6250 (87.6692)  Acc@5: 100.0000 (99.5103)
Valid: 36 [ 390/390]  Loss: 0.1974 (0.374)  Acc@1: 95.0000 (87.6080)  Acc@5: 100.0000 (99.5120)
valid_acc 87.608000
epoch = 36   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('sep_conv_3x3', 3)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1608, 0.0633, 0.0522, 0.1044, 0.2073, 0.1338, 0.1451, 0.1331],
        [0.1827, 0.0518, 0.0417, 0.0689, 0.2167, 0.1794, 0.1197, 0.1390],
        [0.2147, 0.0721, 0.0579, 0.1161, 0.1870, 0.1290, 0.0978, 0.1255],
        [0.2169, 0.0637, 0.0548, 0.0965, 0.1649, 0.1505, 0.1267, 0.1262],
        [0.2498, 0.0469, 0.0419, 0.0852, 0.1420, 0.1487, 0.1483, 0.1372],
        [0.2834, 0.0707, 0.0602, 0.1248, 0.1111, 0.1103, 0.1219, 0.1177],
        [0.3214, 0.0582, 0.0517, 0.0903, 0.1570, 0.1117, 0.1051, 0.1046],
        [0.3807, 0.0430, 0.0407, 0.0849, 0.1167, 0.1111, 0.1049, 0.1179],
        [0.5061, 0.0337, 0.0334, 0.0540, 0.0808, 0.1178, 0.0893, 0.0849],
        [0.3391, 0.0614, 0.0552, 0.1040, 0.1016, 0.1075, 0.1118, 0.1194],
        [0.4641, 0.0483, 0.0432, 0.0698, 0.1142, 0.0911, 0.0858, 0.0835],
        [0.4075, 0.0392, 0.0373, 0.0721, 0.1135, 0.0957, 0.1225, 0.1120],
        [0.4998, 0.0319, 0.0320, 0.0493, 0.0859, 0.0856, 0.1104, 0.1050],
        [0.5107, 0.0264, 0.0270, 0.0343, 0.0855, 0.0985, 0.0934, 0.1243]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1087, 0.1384, 0.1070, 0.1142, 0.1431, 0.1546, 0.1184, 0.1156],
        [0.1241, 0.1051, 0.0887, 0.1386, 0.1552, 0.1333, 0.1354, 0.1197],
        [0.1076, 0.1340, 0.1056, 0.1342, 0.1415, 0.1133, 0.1432, 0.1206],
        [0.1243, 0.1119, 0.1007, 0.1299, 0.1461, 0.1172, 0.1412, 0.1286],
        [0.1216, 0.0859, 0.0820, 0.1279, 0.1559, 0.1405, 0.1405, 0.1457],
        [0.1135, 0.1439, 0.1209, 0.1202, 0.1279, 0.1234, 0.1355, 0.1147],
        [0.1366, 0.1252, 0.1176, 0.1148, 0.1366, 0.1284, 0.1266, 0.1142],
        [0.1305, 0.0837, 0.0890, 0.1516, 0.1526, 0.1412, 0.1257, 0.1257],
        [0.1425, 0.0788, 0.0837, 0.1411, 0.1315, 0.1498, 0.1348, 0.1377],
        [0.1152, 0.1287, 0.1079, 0.1287, 0.1348, 0.1558, 0.1025, 0.1264],
        [0.1218, 0.1122, 0.1053, 0.1245, 0.1303, 0.1369, 0.1302, 0.1389],
        [0.1309, 0.0807, 0.0786, 0.1326, 0.1477, 0.1566, 0.1178, 0.1551],
        [0.1366, 0.0730, 0.0749, 0.1225, 0.1578, 0.1512, 0.1358, 0.1482],
        [0.1601, 0.0663, 0.0689, 0.0962, 0.1574, 0.1401, 0.1473, 0.1638]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 37 [   0/390]  Loss: 0.2767 (0.277)  Acc@1: 89.0625 (89.0625)  Acc@5: 98.4375 (98.4375)LR: 4.785e-03
Train: 37 [  50/390]  Loss: 0.4483 (0.249)  Acc@1: 85.9375 (91.3297)  Acc@5: 98.4375 (99.7549)LR: 4.785e-03
Train: 37 [ 100/390]  Loss: 0.1665 (0.257)  Acc@1: 95.3125 (91.1046)  Acc@5: 98.4375 (99.7525)LR: 4.785e-03
Train: 37 [ 150/390]  Loss: 0.4578 (0.252)  Acc@1: 82.8125 (91.2045)  Acc@5: 100.0000 (99.8344)LR: 4.785e-03
Train: 37 [ 200/390]  Loss: 0.3150 (0.261)  Acc@1: 90.6250 (90.9670)  Acc@5: 100.0000 (99.8057)LR: 4.785e-03
Train: 37 [ 250/390]  Loss: 0.2313 (0.263)  Acc@1: 92.1875 (90.9798)  Acc@5: 100.0000 (99.7883)LR: 4.785e-03
Train: 37 [ 300/390]  Loss: 0.1761 (0.258)  Acc@1: 93.7500 (90.9988)  Acc@5: 100.0000 (99.8027)LR: 4.785e-03
Train: 37 [ 350/390]  Loss: 0.2727 (0.257)  Acc@1: 85.9375 (91.0568)  Acc@5: 100.0000 (99.8219)LR: 4.785e-03
Train: 37 [ 390/390]  Loss: 0.4192 (0.257)  Acc@1: 82.5000 (91.1120)  Acc@5: 97.5000 (99.8080)LR: 4.785e-03
train_acc 91.112000
Valid: 37 [   0/390]  Loss: 0.5961 (0.596)  Acc@1: 81.2500 (81.2500)  Acc@5: 100.0000 (100.0000)
Valid: 37 [  50/390]  Loss: 0.4206 (0.363)  Acc@1: 85.9375 (87.6838)  Acc@5: 100.0000 (99.6017)
Valid: 37 [ 100/390]  Loss: 0.2840 (0.352)  Acc@1: 90.6250 (87.9332)  Acc@5: 100.0000 (99.6132)
Valid: 37 [ 150/390]  Loss: 0.3280 (0.355)  Acc@1: 85.9375 (87.9243)  Acc@5: 100.0000 (99.5654)
Valid: 37 [ 200/390]  Loss: 0.4722 (0.360)  Acc@1: 84.3750 (87.9198)  Acc@5: 98.4375 (99.5569)
Valid: 37 [ 250/390]  Loss: 0.3697 (0.359)  Acc@1: 85.9375 (88.0416)  Acc@5: 100.0000 (99.5767)
Valid: 37 [ 300/390]  Loss: 0.3268 (0.358)  Acc@1: 90.6250 (88.0762)  Acc@5: 100.0000 (99.5588)
Valid: 37 [ 350/390]  Loss: 0.3424 (0.356)  Acc@1: 90.6250 (88.1054)  Acc@5: 100.0000 (99.5459)
Valid: 37 [ 390/390]  Loss: 0.5730 (0.357)  Acc@1: 87.5000 (88.0760)  Acc@5: 97.5000 (99.5400)
valid_acc 88.076000
epoch = 37   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('skip_connect', 2), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('sep_conv_3x3', 3)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1620, 0.0618, 0.0514, 0.1038, 0.2107, 0.1330, 0.1454, 0.1319],
        [0.1854, 0.0508, 0.0411, 0.0682, 0.2177, 0.1800, 0.1190, 0.1379],
        [0.2179, 0.0708, 0.0573, 0.1164, 0.1864, 0.1289, 0.0970, 0.1252],
        [0.2206, 0.0621, 0.0539, 0.0957, 0.1660, 0.1499, 0.1261, 0.1256],
        [0.2549, 0.0457, 0.0413, 0.0843, 0.1417, 0.1467, 0.1474, 0.1380],
        [0.2903, 0.0696, 0.0596, 0.1254, 0.1094, 0.1076, 0.1216, 0.1165],
        [0.3327, 0.0569, 0.0506, 0.0893, 0.1560, 0.1086, 0.1024, 0.1035],
        [0.3916, 0.0420, 0.0399, 0.0836, 0.1153, 0.1097, 0.1030, 0.1148],
        [0.5197, 0.0327, 0.0326, 0.0526, 0.0790, 0.1147, 0.0864, 0.0824],
        [0.3494, 0.0598, 0.0540, 0.1029, 0.0997, 0.1073, 0.1099, 0.1170],
        [0.4804, 0.0468, 0.0420, 0.0682, 0.1104, 0.0882, 0.0831, 0.0809],
        [0.4189, 0.0382, 0.0365, 0.0706, 0.1115, 0.0936, 0.1198, 0.1108],
        [0.5152, 0.0310, 0.0312, 0.0480, 0.0836, 0.0820, 0.1068, 0.1022],
        [0.5257, 0.0254, 0.0262, 0.0334, 0.0831, 0.0955, 0.0900, 0.1207]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1086, 0.1393, 0.1072, 0.1128, 0.1431, 0.1550, 0.1183, 0.1157],
        [0.1241, 0.1034, 0.0875, 0.1391, 0.1559, 0.1347, 0.1349, 0.1204],
        [0.1079, 0.1336, 0.1054, 0.1329, 0.1410, 0.1127, 0.1458, 0.1207],
        [0.1245, 0.1105, 0.1000, 0.1304, 0.1465, 0.1171, 0.1419, 0.1291],
        [0.1210, 0.0842, 0.0814, 0.1268, 0.1583, 0.1411, 0.1416, 0.1455],
        [0.1131, 0.1432, 0.1207, 0.1189, 0.1283, 0.1233, 0.1366, 0.1158],
        [0.1363, 0.1238, 0.1167, 0.1150, 0.1372, 0.1287, 0.1278, 0.1145],
        [0.1318, 0.0828, 0.0893, 0.1530, 0.1524, 0.1395, 0.1256, 0.1257],
        [0.1423, 0.0781, 0.0833, 0.1408, 0.1326, 0.1507, 0.1345, 0.1378],
        [0.1145, 0.1277, 0.1072, 0.1278, 0.1361, 0.1570, 0.1026, 0.1271],
        [0.1214, 0.1107, 0.1045, 0.1233, 0.1312, 0.1377, 0.1314, 0.1397],
        [0.1313, 0.0791, 0.0783, 0.1317, 0.1470, 0.1571, 0.1187, 0.1569],
        [0.1370, 0.0718, 0.0745, 0.1219, 0.1584, 0.1514, 0.1360, 0.1491],
        [0.1605, 0.0649, 0.0683, 0.0954, 0.1586, 0.1406, 0.1482, 0.1635]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 38 [   0/390]  Loss: 0.3024 (0.302)  Acc@1: 90.6250 (90.6250)  Acc@5: 98.4375 (98.4375)LR: 4.252e-03
Train: 38 [  50/390]  Loss: 0.3458 (0.258)  Acc@1: 87.5000 (90.6556)  Acc@5: 100.0000 (99.9081)LR: 4.252e-03
Train: 38 [ 100/390]  Loss: 0.1387 (0.241)  Acc@1: 95.3125 (91.4140)  Acc@5: 100.0000 (99.8917)LR: 4.252e-03
Train: 38 [ 150/390]  Loss: 0.2539 (0.233)  Acc@1: 90.6250 (91.7529)  Acc@5: 98.4375 (99.8862)LR: 4.252e-03
Train: 38 [ 200/390]  Loss: 0.2036 (0.237)  Acc@1: 92.1875 (91.6667)  Acc@5: 100.0000 (99.8523)LR: 4.252e-03
Train: 38 [ 250/390]  Loss: 0.3920 (0.237)  Acc@1: 85.9375 (91.6646)  Acc@5: 100.0000 (99.8568)LR: 4.252e-03
Train: 38 [ 300/390]  Loss: 0.3204 (0.239)  Acc@1: 85.9375 (91.6321)  Acc@5: 100.0000 (99.8495)LR: 4.252e-03
Train: 38 [ 350/390]  Loss: 0.3308 (0.237)  Acc@1: 87.5000 (91.6934)  Acc@5: 100.0000 (99.8665)LR: 4.252e-03
Train: 38 [ 390/390]  Loss: 0.1770 (0.237)  Acc@1: 95.0000 (91.6080)  Acc@5: 100.0000 (99.8600)LR: 4.252e-03
train_acc 91.608000
Valid: 38 [   0/390]  Loss: 0.4061 (0.406)  Acc@1: 81.2500 (81.2500)  Acc@5: 100.0000 (100.0000)
Valid: 38 [  50/390]  Loss: 0.4697 (0.416)  Acc@1: 79.6875 (86.1213)  Acc@5: 100.0000 (99.5098)
Valid: 38 [ 100/390]  Loss: 0.5065 (0.402)  Acc@1: 82.8125 (86.6182)  Acc@5: 96.8750 (99.3967)
Valid: 38 [ 150/390]  Loss: 0.4032 (0.409)  Acc@1: 87.5000 (86.4963)  Acc@5: 100.0000 (99.4723)
Valid: 38 [ 200/390]  Loss: 0.2784 (0.411)  Acc@1: 90.6250 (86.5361)  Acc@5: 100.0000 (99.4092)
Valid: 38 [ 250/390]  Loss: 0.5512 (0.421)  Acc@1: 81.2500 (86.4293)  Acc@5: 100.0000 (99.3588)
Valid: 38 [ 300/390]  Loss: 0.4202 (0.418)  Acc@1: 85.9375 (86.5500)  Acc@5: 98.4375 (99.3667)
Valid: 38 [ 350/390]  Loss: 0.06599 (0.419)  Acc@1: 98.4375 (86.5652)  Acc@5: 100.0000 (99.3590)
Valid: 38 [ 390/390]  Loss: 0.1507 (0.413)  Acc@1: 95.0000 (86.7480)  Acc@5: 100.0000 (99.3760)
valid_acc 86.748000
epoch = 38   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('skip_connect', 2), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1652, 0.0598, 0.0507, 0.1036, 0.2131, 0.1323, 0.1447, 0.1305],
        [0.1860, 0.0494, 0.0406, 0.0674, 0.2191, 0.1828, 0.1183, 0.1363],
        [0.2235, 0.0687, 0.0565, 0.1163, 0.1856, 0.1291, 0.0958, 0.1245],
        [0.2240, 0.0606, 0.0534, 0.0950, 0.1657, 0.1508, 0.1256, 0.1249],
        [0.2607, 0.0442, 0.0404, 0.0830, 0.1407, 0.1462, 0.1463, 0.1385],
        [0.2981, 0.0679, 0.0590, 0.1265, 0.1069, 0.1066, 0.1206, 0.1144],
        [0.3436, 0.0554, 0.0498, 0.0884, 0.1529, 0.1070, 0.1006, 0.1024],
        [0.4024, 0.0409, 0.0391, 0.0827, 0.1132, 0.1080, 0.1009, 0.1128],
        [0.5321, 0.0318, 0.0318, 0.0511, 0.0765, 0.1124, 0.0837, 0.0807],
        [0.3620, 0.0580, 0.0531, 0.1028, 0.0965, 0.1052, 0.1079, 0.1145],
        [0.4956, 0.0448, 0.0407, 0.0663, 0.1072, 0.0852, 0.0812, 0.0790],
        [0.4329, 0.0367, 0.0355, 0.0690, 0.1089, 0.0911, 0.1169, 0.1090],
        [0.5332, 0.0299, 0.0302, 0.0464, 0.0811, 0.0791, 0.1022, 0.0980],
        [0.5402, 0.0245, 0.0254, 0.0326, 0.0798, 0.0930, 0.0866, 0.1180]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1080, 0.1391, 0.1075, 0.1126, 0.1433, 0.1552, 0.1191, 0.1153],
        [0.1245, 0.1022, 0.0871, 0.1381, 0.1574, 0.1348, 0.1357, 0.1201],
        [0.1078, 0.1325, 0.1053, 0.1328, 0.1420, 0.1120, 0.1463, 0.1213],
        [0.1242, 0.1099, 0.1003, 0.1311, 0.1480, 0.1170, 0.1420, 0.1273],
        [0.1208, 0.0826, 0.0810, 0.1266, 0.1579, 0.1433, 0.1419, 0.1459],
        [0.1125, 0.1426, 0.1212, 0.1178, 0.1304, 0.1231, 0.1367, 0.1158],
        [0.1372, 0.1229, 0.1172, 0.1138, 0.1394, 0.1282, 0.1277, 0.1137],
        [0.1322, 0.0812, 0.0888, 0.1536, 0.1527, 0.1393, 0.1266, 0.1256],
        [0.1433, 0.0762, 0.0827, 0.1406, 0.1333, 0.1515, 0.1340, 0.1384],
        [0.1145, 0.1265, 0.1070, 0.1272, 0.1366, 0.1583, 0.1027, 0.1273],
        [0.1205, 0.1095, 0.1045, 0.1229, 0.1324, 0.1373, 0.1326, 0.1403],
        [0.1315, 0.0774, 0.0776, 0.1313, 0.1463, 0.1584, 0.1187, 0.1588],
        [0.1371, 0.0701, 0.0743, 0.1220, 0.1583, 0.1526, 0.1358, 0.1498],
        [0.1610, 0.0632, 0.0675, 0.0940, 0.1603, 0.1417, 0.1472, 0.1651]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 39 [   0/390]  Loss: 0.2426 (0.243)  Acc@1: 90.6250 (90.6250)  Acc@5: 100.0000 (100.0000)LR: 3.754e-03
Train: 39 [  50/390]  Loss: 0.2761 (0.232)  Acc@1: 90.6250 (92.5245)  Acc@5: 98.4375 (99.9387)LR: 3.754e-03
Train: 39 [ 100/390]  Loss: 0.1210 (0.238)  Acc@1: 96.8750 (92.0483)  Acc@5: 100.0000 (99.8762)LR: 3.754e-03
Train: 39 [ 150/390]  Loss: 0.1923 (0.239)  Acc@1: 93.7500 (91.8253)  Acc@5: 100.0000 (99.8551)LR: 3.754e-03
Train: 39 [ 200/390]  Loss: 0.2927 (0.244)  Acc@1: 89.0625 (91.3946)  Acc@5: 100.0000 (99.8756)LR: 3.754e-03
Train: 39 [ 250/390]  Loss: 0.1872 (0.244)  Acc@1: 89.0625 (91.4841)  Acc@5: 100.0000 (99.8444)LR: 3.754e-03
Train: 39 [ 300/390]  Loss: 0.1676 (0.247)  Acc@1: 95.3125 (91.4037)  Acc@5: 100.0000 (99.8339)LR: 3.754e-03
Train: 39 [ 350/390]  Loss: 0.1987 (0.247)  Acc@1: 92.1875 (91.3684)  Acc@5: 100.0000 (99.8264)LR: 3.754e-03
Train: 39 [ 390/390]  Loss: 0.5343 (0.246)  Acc@1: 85.0000 (91.4280)  Acc@5: 97.5000 (99.8320)LR: 3.754e-03
train_acc 91.428000
Valid: 39 [   0/390]  Loss: 0.2160 (0.216)  Acc@1: 92.1875 (92.1875)  Acc@5: 98.4375 (98.4375)
Valid: 39 [  50/390]  Loss: 0.5015 (0.359)  Acc@1: 82.8125 (87.8064)  Acc@5: 100.0000 (99.5711)
Valid: 39 [ 100/390]  Loss: 0.4980 (0.371)  Acc@1: 81.2500 (87.6083)  Acc@5: 98.4375 (99.5668)
Valid: 39 [ 150/390]  Loss: 0.3716 (0.365)  Acc@1: 92.1875 (87.8104)  Acc@5: 100.0000 (99.5137)
Valid: 39 [ 200/390]  Loss: 0.3259 (0.363)  Acc@1: 89.0625 (87.9275)  Acc@5: 100.0000 (99.5414)
Valid: 39 [ 250/390]  Loss: 0.2418 (0.360)  Acc@1: 92.1875 (87.9669)  Acc@5: 100.0000 (99.5269)
Valid: 39 [ 300/390]  Loss: 0.4366 (0.359)  Acc@1: 87.5000 (88.0191)  Acc@5: 100.0000 (99.5380)
Valid: 39 [ 350/390]  Loss: 0.3413 (0.363)  Acc@1: 89.0625 (88.0386)  Acc@5: 98.4375 (99.4881)
Valid: 39 [ 390/390]  Loss: 0.3786 (0.363)  Acc@1: 90.0000 (88.0200)  Acc@5: 100.0000 (99.5040)
valid_acc 88.020000
epoch = 39   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('skip_connect', 2), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1689, 0.0585, 0.0501, 0.1032, 0.2139, 0.1320, 0.1440, 0.1295],
        [0.1868, 0.0483, 0.0403, 0.0669, 0.2194, 0.1839, 0.1184, 0.1360],
        [0.2305, 0.0673, 0.0560, 0.1161, 0.1838, 0.1285, 0.0942, 0.1236],
        [0.2269, 0.0594, 0.0530, 0.0946, 0.1656, 0.1508, 0.1260, 0.1237],
        [0.2646, 0.0431, 0.0396, 0.0814, 0.1395, 0.1460, 0.1476, 0.1381],
        [0.3089, 0.0660, 0.0579, 0.1255, 0.1049, 0.1048, 0.1199, 0.1121],
        [0.3542, 0.0539, 0.0491, 0.0874, 0.1503, 0.1051, 0.0987, 0.1013],
        [0.4140, 0.0398, 0.0380, 0.0806, 0.1117, 0.1070, 0.0994, 0.1094],
        [0.5466, 0.0310, 0.0310, 0.0496, 0.0739, 0.1094, 0.0802, 0.0783],
        [0.3742, 0.0560, 0.0518, 0.1007, 0.0943, 0.1051, 0.1055, 0.1123],
        [0.5097, 0.0433, 0.0397, 0.0645, 0.1041, 0.0827, 0.0791, 0.0769],
        [0.4441, 0.0359, 0.0348, 0.0674, 0.1074, 0.0891, 0.1145, 0.1068],
        [0.5491, 0.0289, 0.0293, 0.0448, 0.0790, 0.0765, 0.0981, 0.0943],
        [0.5520, 0.0238, 0.0248, 0.0318, 0.0771, 0.0910, 0.0838, 0.1158]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1074, 0.1395, 0.1075, 0.1120, 0.1434, 0.1559, 0.1189, 0.1154],
        [0.1251, 0.1019, 0.0868, 0.1371, 0.1587, 0.1355, 0.1364, 0.1185],
        [0.1077, 0.1317, 0.1049, 0.1312, 0.1430, 0.1120, 0.1477, 0.1218],
        [0.1239, 0.1098, 0.1004, 0.1300, 0.1512, 0.1168, 0.1413, 0.1266],
        [0.1207, 0.0809, 0.0804, 0.1264, 0.1578, 0.1457, 0.1416, 0.1464],
        [0.1119, 0.1420, 0.1209, 0.1173, 0.1320, 0.1220, 0.1380, 0.1159],
        [0.1384, 0.1226, 0.1167, 0.1140, 0.1386, 0.1276, 0.1292, 0.1130],
        [0.1328, 0.0800, 0.0881, 0.1541, 0.1529, 0.1386, 0.1271, 0.1263],
        [0.1442, 0.0749, 0.0815, 0.1402, 0.1337, 0.1517, 0.1340, 0.1398],
        [0.1146, 0.1257, 0.1061, 0.1266, 0.1373, 0.1586, 0.1026, 0.1286],
        [0.1199, 0.1098, 0.1047, 0.1215, 0.1335, 0.1372, 0.1328, 0.1405],
        [0.1307, 0.0758, 0.0767, 0.1298, 0.1459, 0.1603, 0.1187, 0.1621],
        [0.1362, 0.0688, 0.0732, 0.1207, 0.1604, 0.1563, 0.1354, 0.1490],
        [0.1626, 0.0623, 0.0669, 0.0932, 0.1617, 0.1428, 0.1458, 0.1646]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 40 [   0/390]  Loss: 0.3091 (0.309)  Acc@1: 87.5000 (87.5000)  Acc@5: 100.0000 (100.0000)LR: 3.292e-03
Train: 40 [  50/390]  Loss: 0.1779 (0.233)  Acc@1: 93.7500 (91.9424)  Acc@5: 100.0000 (99.9081)LR: 3.292e-03
Train: 40 [ 100/390]  Loss: 0.1172 (0.227)  Acc@1: 96.8750 (92.0637)  Acc@5: 100.0000 (99.8608)LR: 3.292e-03
Train: 40 [ 150/390]  Loss: 0.5046 (0.235)  Acc@1: 85.9375 (91.7943)  Acc@5: 98.4375 (99.8241)LR: 3.292e-03
Train: 40 [ 200/390]  Loss: 0.1977 (0.239)  Acc@1: 92.1875 (91.5812)  Acc@5: 100.0000 (99.7901)LR: 3.292e-03
Train: 40 [ 250/390]  Loss: 0.1305 (0.236)  Acc@1: 95.3125 (91.7019)  Acc@5: 100.0000 (99.7946)LR: 3.292e-03
Train: 40 [ 300/390]  Loss: 0.08046 (0.235)  Acc@1: 98.4375 (91.7774)  Acc@5: 100.0000 (99.8079)LR: 3.292e-03
Train: 40 [ 350/390]  Loss: 0.2179 (0.237)  Acc@1: 92.1875 (91.6444)  Acc@5: 100.0000 (99.8041)LR: 3.292e-03
Train: 40 [ 390/390]  Loss: 0.2478 (0.238)  Acc@1: 90.0000 (91.6360)  Acc@5: 100.0000 (99.8040)LR: 3.292e-03
train_acc 91.636000
Valid: 40 [   0/390]  Loss: 0.2227 (0.223)  Acc@1: 92.1875 (92.1875)  Acc@5: 100.0000 (100.0000)
Valid: 40 [  50/390]  Loss: 0.2660 (0.361)  Acc@1: 90.6250 (88.3578)  Acc@5: 100.0000 (99.4792)
Valid: 40 [ 100/390]  Loss: 0.1725 (0.360)  Acc@1: 93.7500 (88.4746)  Acc@5: 100.0000 (99.4740)
Valid: 40 [ 150/390]  Loss: 0.3437 (0.370)  Acc@1: 89.0625 (88.0795)  Acc@5: 100.0000 (99.4619)
Valid: 40 [ 200/390]  Loss: 0.2995 (0.370)  Acc@1: 93.7500 (88.0208)  Acc@5: 100.0000 (99.4403)
Valid: 40 [ 250/390]  Loss: 0.4134 (0.372)  Acc@1: 87.5000 (87.9482)  Acc@5: 95.3125 (99.4709)
Valid: 40 [ 300/390]  Loss: 0.1749 (0.372)  Acc@1: 93.7500 (87.8478)  Acc@5: 100.0000 (99.5224)
Valid: 40 [ 350/390]  Loss: 0.3399 (0.371)  Acc@1: 87.5000 (87.8428)  Acc@5: 100.0000 (99.5192)
Valid: 40 [ 390/390]  Loss: 0.3155 (0.371)  Acc@1: 87.5000 (87.8040)  Acc@5: 100.0000 (99.5120)
valid_acc 87.804000
epoch = 40   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('dil_conv_5x5', 4), ('dil_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1705, 0.0568, 0.0494, 0.1022, 0.2165, 0.1315, 0.1434, 0.1297],
        [0.1897, 0.0472, 0.0400, 0.0660, 0.2219, 0.1842, 0.1174, 0.1336],
        [0.2340, 0.0656, 0.0552, 0.1155, 0.1837, 0.1294, 0.0930, 0.1235],
        [0.2324, 0.0580, 0.0522, 0.0937, 0.1643, 0.1506, 0.1245, 0.1242],
        [0.2689, 0.0420, 0.0388, 0.0801, 0.1385, 0.1464, 0.1476, 0.1378],
        [0.3155, 0.0649, 0.0574, 0.1259, 0.1022, 0.1037, 0.1190, 0.1113],
        [0.3661, 0.0530, 0.0485, 0.0865, 0.1479, 0.1020, 0.0966, 0.0995],
        [0.4255, 0.0391, 0.0374, 0.0795, 0.1098, 0.1045, 0.0971, 0.1071],
        [0.5594, 0.0303, 0.0305, 0.0486, 0.0717, 0.1065, 0.0770, 0.0760],
        [0.3835, 0.0546, 0.0511, 0.1003, 0.0924, 0.1039, 0.1039, 0.1102],
        [0.5234, 0.0421, 0.0388, 0.0629, 0.1012, 0.0802, 0.0768, 0.0747],
        [0.4551, 0.0351, 0.0341, 0.0659, 0.1057, 0.0883, 0.1116, 0.1043],
        [0.5637, 0.0282, 0.0287, 0.0439, 0.0764, 0.0740, 0.0938, 0.0913],
        [0.5635, 0.0231, 0.0243, 0.0313, 0.0747, 0.0896, 0.0804, 0.1131]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1082, 0.1404, 0.1075, 0.1115, 0.1421, 0.1545, 0.1194, 0.1164],
        [0.1240, 0.1006, 0.0855, 0.1372, 0.1598, 0.1368, 0.1375, 0.1186],
        [0.1077, 0.1315, 0.1045, 0.1299, 0.1431, 0.1125, 0.1493, 0.1214],
        [0.1246, 0.1085, 0.0993, 0.1308, 0.1519, 0.1172, 0.1414, 0.1263],
        [0.1206, 0.0798, 0.0801, 0.1257, 0.1580, 0.1467, 0.1421, 0.1471],
        [0.1124, 0.1422, 0.1208, 0.1173, 0.1318, 0.1214, 0.1389, 0.1153],
        [0.1384, 0.1218, 0.1160, 0.1137, 0.1395, 0.1274, 0.1296, 0.1135],
        [0.1327, 0.0792, 0.0880, 0.1543, 0.1550, 0.1381, 0.1272, 0.1256],
        [0.1434, 0.0738, 0.0810, 0.1391, 0.1340, 0.1527, 0.1350, 0.1411],
        [0.1147, 0.1257, 0.1058, 0.1259, 0.1378, 0.1591, 0.1025, 0.1285],
        [0.1196, 0.1088, 0.1042, 0.1219, 0.1331, 0.1374, 0.1332, 0.1419],
        [0.1312, 0.0751, 0.0766, 0.1297, 0.1448, 0.1603, 0.1197, 0.1628],
        [0.1353, 0.0678, 0.0727, 0.1195, 0.1614, 0.1564, 0.1363, 0.1506],
        [0.1623, 0.0618, 0.0667, 0.0925, 0.1637, 0.1434, 0.1459, 0.1638]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 41 [   0/390]  Loss: 0.1021 (0.102)  Acc@1: 98.4375 (98.4375)  Acc@5: 100.0000 (100.0000)LR: 2.868e-03
Train: 41 [  50/390]  Loss: 0.1778 (0.240)  Acc@1: 95.3125 (91.4522)  Acc@5: 100.0000 (99.8468)LR: 2.868e-03
Train: 41 [ 100/390]  Loss: 0.2392 (0.235)  Acc@1: 89.0625 (91.6460)  Acc@5: 100.0000 (99.8608)LR: 2.868e-03
Train: 41 [ 150/390]  Loss: 0.1536 (0.231)  Acc@1: 93.7500 (91.8874)  Acc@5: 100.0000 (99.8758)LR: 2.868e-03
Train: 41 [ 200/390]  Loss: 0.2519 (0.233)  Acc@1: 92.1875 (91.9232)  Acc@5: 100.0000 (99.8834)LR: 2.868e-03
Train: 41 [ 250/390]  Loss: 0.2100 (0.230)  Acc@1: 92.1875 (91.9634)  Acc@5: 100.0000 (99.8755)LR: 2.868e-03
Train: 41 [ 300/390]  Loss: 0.3479 (0.233)  Acc@1: 90.6250 (91.8241)  Acc@5: 100.0000 (99.8547)LR: 2.868e-03
Train: 41 [ 350/390]  Loss: 0.2067 (0.234)  Acc@1: 92.1875 (91.7245)  Acc@5: 100.0000 (99.8575)LR: 2.868e-03
Train: 41 [ 390/390]  Loss: 0.2049 (0.235)  Acc@1: 95.0000 (91.6400)  Acc@5: 100.0000 (99.8520)LR: 2.868e-03
train_acc 91.640000
Valid: 41 [   0/390]  Loss: 0.4027 (0.403)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)
Valid: 41 [  50/390]  Loss: 0.3853 (0.367)  Acc@1: 89.0625 (87.7145)  Acc@5: 100.0000 (99.5711)
Valid: 41 [ 100/390]  Loss: 0.5561 (0.376)  Acc@1: 87.5000 (87.6547)  Acc@5: 100.0000 (99.5978)
Valid: 41 [ 150/390]  Loss: 0.4640 (0.377)  Acc@1: 85.9375 (87.8001)  Acc@5: 100.0000 (99.5654)
Valid: 41 [ 200/390]  Loss: 0.3503 (0.372)  Acc@1: 89.0625 (87.9353)  Acc@5: 100.0000 (99.5647)
Valid: 41 [ 250/390]  Loss: 0.3577 (0.377)  Acc@1: 84.3750 (87.8548)  Acc@5: 100.0000 (99.5331)
Valid: 41 [ 300/390]  Loss: 0.3787 (0.368)  Acc@1: 89.0625 (88.0139)  Acc@5: 100.0000 (99.5224)
Valid: 41 [ 350/390]  Loss: 0.4067 (0.366)  Acc@1: 87.5000 (88.1232)  Acc@5: 98.4375 (99.5326)
Valid: 41 [ 390/390]  Loss: 0.1467 (0.369)  Acc@1: 95.0000 (88.0120)  Acc@5: 100.0000 (99.5400)
valid_acc 88.012000
epoch = 41   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('skip_connect', 2), ('sep_conv_5x5', 3), ('sep_conv_3x3', 4), ('dil_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1717, 0.0553, 0.0485, 0.1010, 0.2180, 0.1324, 0.1433, 0.1297],
        [0.1925, 0.0459, 0.0394, 0.0652, 0.2210, 0.1850, 0.1182, 0.1328],
        [0.2388, 0.0641, 0.0543, 0.1143, 0.1838, 0.1298, 0.0916, 0.1233],
        [0.2365, 0.0567, 0.0514, 0.0926, 0.1648, 0.1511, 0.1240, 0.1229],
        [0.2749, 0.0411, 0.0383, 0.0794, 0.1374, 0.1451, 0.1474, 0.1365],
        [0.3229, 0.0638, 0.0567, 0.1254, 0.1004, 0.1037, 0.1175, 0.1095],
        [0.3787, 0.0514, 0.0473, 0.0847, 0.1446, 0.1004, 0.0947, 0.0982],
        [0.4370, 0.0382, 0.0368, 0.0783, 0.1074, 0.1035, 0.0947, 0.1041],
        [0.5737, 0.0294, 0.0296, 0.0470, 0.0696, 0.1032, 0.0739, 0.0737],
        [0.3949, 0.0534, 0.0502, 0.0993, 0.0899, 0.1031, 0.1018, 0.1073],
        [0.5378, 0.0408, 0.0380, 0.0617, 0.0977, 0.0775, 0.0743, 0.0723],
        [0.4691, 0.0341, 0.0336, 0.0652, 0.1031, 0.0848, 0.1079, 0.1022],
        [0.5812, 0.0271, 0.0279, 0.0427, 0.0732, 0.0711, 0.0895, 0.0873],
        [0.5785, 0.0224, 0.0236, 0.0305, 0.0719, 0.0866, 0.0767, 0.1097]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1091, 0.1411, 0.1082, 0.1113, 0.1409, 0.1535, 0.1196, 0.1163],
        [0.1225, 0.1002, 0.0853, 0.1374, 0.1603, 0.1381, 0.1380, 0.1182],
        [0.1079, 0.1307, 0.1043, 0.1283, 0.1428, 0.1132, 0.1502, 0.1226],
        [0.1243, 0.1085, 0.0995, 0.1302, 0.1524, 0.1181, 0.1411, 0.1259],
        [0.1212, 0.0790, 0.0803, 0.1263, 0.1590, 0.1466, 0.1408, 0.1468],
        [0.1124, 0.1409, 0.1200, 0.1164, 0.1324, 0.1214, 0.1405, 0.1159],
        [0.1384, 0.1223, 0.1170, 0.1135, 0.1395, 0.1265, 0.1289, 0.1139],
        [0.1326, 0.0785, 0.0884, 0.1555, 0.1552, 0.1369, 0.1268, 0.1261],
        [0.1425, 0.0726, 0.0806, 0.1381, 0.1339, 0.1543, 0.1350, 0.1431],
        [0.1158, 0.1244, 0.1052, 0.1255, 0.1389, 0.1587, 0.1030, 0.1285],
        [0.1189, 0.1083, 0.1043, 0.1214, 0.1329, 0.1373, 0.1344, 0.1426],
        [0.1306, 0.0737, 0.0760, 0.1284, 0.1441, 0.1620, 0.1210, 0.1643],
        [0.1359, 0.0665, 0.0723, 0.1185, 0.1623, 0.1574, 0.1356, 0.1516],
        [0.1626, 0.0609, 0.0664, 0.0919, 0.1652, 0.1440, 0.1462, 0.1629]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 42 [   0/390]  Loss: 0.2555 (0.255)  Acc@1: 87.5000 (87.5000)  Acc@5: 100.0000 (100.0000)LR: 2.484e-03
Train: 42 [  50/390]  Loss: 0.1937 (0.234)  Acc@1: 93.7500 (91.7892)  Acc@5: 98.4375 (99.7855)LR: 2.484e-03
Train: 42 [ 100/390]  Loss: 0.1142 (0.238)  Acc@1: 98.4375 (91.6460)  Acc@5: 100.0000 (99.7989)LR: 2.484e-03
Train: 42 [ 150/390]  Loss: 0.1164 (0.228)  Acc@1: 98.4375 (92.1254)  Acc@5: 100.0000 (99.8344)LR: 2.484e-03
Train: 42 [ 200/390]  Loss: 0.1110 (0.222)  Acc@1: 96.8750 (92.2108)  Acc@5: 100.0000 (99.8601)LR: 2.484e-03
Train: 42 [ 250/390]  Loss: 0.2003 (0.225)  Acc@1: 93.7500 (92.1501)  Acc@5: 100.0000 (99.8693)LR: 2.484e-03
Train: 42 [ 300/390]  Loss: 0.1624 (0.223)  Acc@1: 93.7500 (92.2186)  Acc@5: 100.0000 (99.8650)LR: 2.484e-03
Train: 42 [ 350/390]  Loss: 0.1346 (0.224)  Acc@1: 95.3125 (92.1430)  Acc@5: 100.0000 (99.8754)LR: 2.484e-03
Train: 42 [ 390/390]  Loss: 0.1861 (0.227)  Acc@1: 92.5000 (92.0560)  Acc@5: 100.0000 (99.8800)LR: 2.484e-03
train_acc 92.056000
Valid: 42 [   0/390]  Loss: 0.3225 (0.323)  Acc@1: 90.6250 (90.6250)  Acc@5: 100.0000 (100.0000)
Valid: 42 [  50/390]  Loss: 0.3790 (0.346)  Acc@1: 87.5000 (88.4804)  Acc@5: 98.4375 (99.4485)
Valid: 42 [ 100/390]  Loss: 0.3146 (0.347)  Acc@1: 92.1875 (88.3199)  Acc@5: 98.4375 (99.4431)
Valid: 42 [ 150/390]  Loss: 0.4019 (0.349)  Acc@1: 90.6250 (88.4727)  Acc@5: 100.0000 (99.4930)
Valid: 42 [ 200/390]  Loss: 0.3496 (0.350)  Acc@1: 89.0625 (88.1608)  Acc@5: 100.0000 (99.5103)
Valid: 42 [ 250/390]  Loss: 0.4261 (0.355)  Acc@1: 82.8125 (88.0789)  Acc@5: 98.4375 (99.5082)
Valid: 42 [ 300/390]  Loss: 0.3802 (0.356)  Acc@1: 89.0625 (88.1281)  Acc@5: 100.0000 (99.5276)
Valid: 42 [ 350/390]  Loss: 0.2395 (0.359)  Acc@1: 90.6250 (87.9986)  Acc@5: 100.0000 (99.5370)
Valid: 42 [ 390/390]  Loss: 0.2052 (0.358)  Acc@1: 92.5000 (88.0480)  Acc@5: 100.0000 (99.5360)
valid_acc 88.048000
epoch = 42   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 4), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('skip_connect', 2), ('sep_conv_5x5', 3), ('sep_conv_3x3', 4), ('sep_conv_3x3', 3)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1729, 0.0543, 0.0480, 0.1008, 0.2191, 0.1318, 0.1434, 0.1297],
        [0.1953, 0.0447, 0.0387, 0.0641, 0.2222, 0.1859, 0.1182, 0.1310],
        [0.2429, 0.0627, 0.0534, 0.1133, 0.1844, 0.1301, 0.0899, 0.1234],
        [0.2421, 0.0553, 0.0504, 0.0912, 0.1629, 0.1503, 0.1245, 0.1234],
        [0.2807, 0.0403, 0.0379, 0.0785, 0.1361, 0.1431, 0.1467, 0.1367],
        [0.3294, 0.0628, 0.0559, 0.1254, 0.0986, 0.1024, 0.1169, 0.1086],
        [0.3922, 0.0498, 0.0461, 0.0827, 0.1424, 0.0977, 0.0922, 0.0969],
        [0.4490, 0.0375, 0.0362, 0.0770, 0.1052, 0.1020, 0.0922, 0.1009],
        [0.5877, 0.0287, 0.0289, 0.0457, 0.0670, 0.1003, 0.0711, 0.0706],
        [0.4045, 0.0523, 0.0494, 0.0990, 0.0878, 0.1014, 0.1008, 0.1048],
        [0.5529, 0.0393, 0.0368, 0.0598, 0.0949, 0.0746, 0.0718, 0.0699],
        [0.4807, 0.0334, 0.0331, 0.0640, 0.1005, 0.0828, 0.1049, 0.1006],
        [0.5980, 0.0264, 0.0272, 0.0417, 0.0701, 0.0680, 0.0855, 0.0831],
        [0.5922, 0.0218, 0.0232, 0.0302, 0.0688, 0.0844, 0.0732, 0.1061]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1095, 0.1402, 0.1076, 0.1111, 0.1417, 0.1534, 0.1206, 0.1159],
        [0.1218, 0.0987, 0.0840, 0.1376, 0.1611, 0.1392, 0.1392, 0.1184],
        [0.1082, 0.1289, 0.1033, 0.1275, 0.1427, 0.1135, 0.1523, 0.1236],
        [0.1228, 0.1067, 0.0979, 0.1306, 0.1535, 0.1194, 0.1426, 0.1264],
        [0.1212, 0.0779, 0.0799, 0.1254, 0.1608, 0.1479, 0.1402, 0.1466],
        [0.1129, 0.1398, 0.1201, 0.1164, 0.1325, 0.1211, 0.1408, 0.1163],
        [0.1386, 0.1218, 0.1168, 0.1137, 0.1393, 0.1273, 0.1288, 0.1137],
        [0.1326, 0.0774, 0.0885, 0.1566, 0.1559, 0.1358, 0.1275, 0.1256],
        [0.1427, 0.0718, 0.0807, 0.1391, 0.1336, 0.1532, 0.1357, 0.1432],
        [0.1160, 0.1233, 0.1047, 0.1245, 0.1398, 0.1602, 0.1027, 0.1288],
        [0.1190, 0.1068, 0.1028, 0.1213, 0.1330, 0.1379, 0.1362, 0.1430],
        [0.1306, 0.0728, 0.0761, 0.1286, 0.1439, 0.1628, 0.1210, 0.1643],
        [0.1349, 0.0657, 0.0720, 0.1181, 0.1651, 0.1575, 0.1344, 0.1524],
        [0.1626, 0.0599, 0.0659, 0.0908, 0.1667, 0.1435, 0.1462, 0.1644]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 43 [   0/390]  Loss: 0.2453 (0.245)  Acc@1: 92.1875 (92.1875)  Acc@5: 100.0000 (100.0000)LR: 2.142e-03
Train: 43 [  50/390]  Loss: 0.1065 (0.238)  Acc@1: 98.4375 (92.0650)  Acc@5: 100.0000 (99.8162)LR: 2.142e-03
Train: 43 [ 100/390]  Loss: 0.3602 (0.246)  Acc@1: 84.3750 (91.7079)  Acc@5: 100.0000 (99.7679)LR: 2.142e-03
Train: 43 [ 150/390]  Loss: 0.2547 (0.236)  Acc@1: 95.3125 (92.0426)  Acc@5: 100.0000 (99.8137)LR: 2.142e-03
Train: 43 [ 200/390]  Loss: 0.2227 (0.236)  Acc@1: 92.1875 (91.8221)  Acc@5: 100.0000 (99.8057)LR: 2.142e-03
Train: 43 [ 250/390]  Loss: 0.1733 (0.236)  Acc@1: 93.7500 (91.8078)  Acc@5: 100.0000 (99.8381)LR: 2.142e-03
Train: 43 [ 300/390]  Loss: 0.1797 (0.232)  Acc@1: 93.7500 (91.9331)  Acc@5: 100.0000 (99.8391)LR: 2.142e-03
Train: 43 [ 350/390]  Loss: 0.2587 (0.230)  Acc@1: 93.7500 (91.9338)  Acc@5: 100.0000 (99.8575)LR: 2.142e-03
Train: 43 [ 390/390]  Loss: 0.3762 (0.234)  Acc@1: 87.5000 (91.7720)  Acc@5: 100.0000 (99.8560)LR: 2.142e-03
train_acc 91.772000
Valid: 43 [   0/390]  Loss: 0.4970 (0.497)  Acc@1: 79.6875 (79.6875)  Acc@5: 100.0000 (100.0000)
Valid: 43 [  50/390]  Loss: 0.3778 (0.371)  Acc@1: 89.0625 (87.5613)  Acc@5: 100.0000 (99.5098)
Valid: 43 [ 100/390]  Loss: 0.3833 (0.375)  Acc@1: 85.9375 (87.3298)  Acc@5: 98.4375 (99.3967)
Valid: 43 [ 150/390]  Loss: 0.2814 (0.377)  Acc@1: 89.0625 (87.4483)  Acc@5: 100.0000 (99.3791)
Valid: 43 [ 200/390]  Loss: 0.7107 (0.384)  Acc@1: 76.5625 (87.2979)  Acc@5: 98.4375 (99.3703)
Valid: 43 [ 250/390]  Loss: 0.5594 (0.388)  Acc@1: 78.1250 (87.2572)  Acc@5: 98.4375 (99.3152)
Valid: 43 [ 300/390]  Loss: 0.3297 (0.382)  Acc@1: 89.0625 (87.5104)  Acc@5: 100.0000 (99.3252)
Valid: 43 [ 350/390]  Loss: 0.3407 (0.383)  Acc@1: 85.9375 (87.3843)  Acc@5: 100.0000 (99.3545)
Valid: 43 [ 390/390]  Loss: 0.3275 (0.384)  Acc@1: 87.5000 (87.3440)  Acc@5: 100.0000 (99.3720)
valid_acc 87.344000
epoch = 43   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 0), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('dil_conv_3x3', 0), ('skip_connect', 2), ('sep_conv_5x5', 3), ('sep_conv_3x3', 4), ('sep_conv_3x3', 3)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1744, 0.0534, 0.0476, 0.1007, 0.2214, 0.1303, 0.1427, 0.1295],
        [0.1982, 0.0436, 0.0382, 0.0632, 0.2222, 0.1874, 0.1171, 0.1301],
        [0.2470, 0.0614, 0.0527, 0.1125, 0.1841, 0.1296, 0.0898, 0.1229],
        [0.2467, 0.0542, 0.0500, 0.0906, 0.1626, 0.1505, 0.1233, 0.1220],
        [0.2875, 0.0395, 0.0374, 0.0775, 0.1346, 0.1416, 0.1460, 0.1360],
        [0.3359, 0.0620, 0.0555, 0.1260, 0.0969, 0.1005, 0.1168, 0.1063],
        [0.4075, 0.0485, 0.0451, 0.0814, 0.1392, 0.0946, 0.0895, 0.0942],
        [0.4616, 0.0366, 0.0356, 0.0755, 0.1026, 0.1001, 0.0895, 0.0985],
        [0.6004, 0.0279, 0.0283, 0.0445, 0.0647, 0.0976, 0.0684, 0.0683],
        [0.4122, 0.0511, 0.0487, 0.0982, 0.0865, 0.1004, 0.0992, 0.1036],
        [0.5678, 0.0381, 0.0360, 0.0584, 0.0918, 0.0716, 0.0688, 0.0675],
        [0.4940, 0.0326, 0.0324, 0.0628, 0.0979, 0.0806, 0.1019, 0.0979],
        [0.6122, 0.0256, 0.0265, 0.0406, 0.0675, 0.0658, 0.0820, 0.0797],
        [0.6067, 0.0212, 0.0226, 0.0297, 0.0662, 0.0818, 0.0700, 0.1017]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1097, 0.1408, 0.1074, 0.1104, 0.1417, 0.1555, 0.1191, 0.1154],
        [0.1213, 0.0984, 0.0835, 0.1373, 0.1630, 0.1400, 0.1385, 0.1180],
        [0.1077, 0.1278, 0.1025, 0.1271, 0.1430, 0.1136, 0.1559, 0.1225],
        [0.1224, 0.1058, 0.0970, 0.1297, 0.1552, 0.1205, 0.1429, 0.1265],
        [0.1211, 0.0768, 0.0795, 0.1253, 0.1625, 0.1479, 0.1396, 0.1473],
        [0.1132, 0.1394, 0.1198, 0.1165, 0.1337, 0.1202, 0.1409, 0.1163],
        [0.1389, 0.1214, 0.1162, 0.1141, 0.1394, 0.1271, 0.1289, 0.1140],
        [0.1329, 0.0766, 0.0881, 0.1571, 0.1568, 0.1357, 0.1279, 0.1250],
        [0.1429, 0.0711, 0.0805, 0.1393, 0.1343, 0.1527, 0.1354, 0.1438],
        [0.1160, 0.1222, 0.1039, 0.1241, 0.1415, 0.1623, 0.1017, 0.1283],
        [0.1185, 0.1070, 0.1028, 0.1211, 0.1339, 0.1374, 0.1364, 0.1430],
        [0.1307, 0.0722, 0.0759, 0.1291, 0.1441, 0.1616, 0.1216, 0.1648],
        [0.1352, 0.0653, 0.0717, 0.1181, 0.1664, 0.1566, 0.1343, 0.1523],
        [0.1631, 0.0592, 0.0652, 0.0898, 0.1687, 0.1445, 0.1460, 0.1636]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 44 [   0/390]  Loss: 0.3439 (0.344)  Acc@1: 89.0625 (89.0625)  Acc@5: 100.0000 (100.0000)LR: 1.843e-03
Train: 44 [  50/390]  Loss: 0.2696 (0.226)  Acc@1: 89.0625 (92.2181)  Acc@5: 100.0000 (99.8775)LR: 1.843e-03
Train: 44 [ 100/390]  Loss: 0.2017 (0.228)  Acc@1: 95.3125 (92.1411)  Acc@5: 98.4375 (99.8298)LR: 1.843e-03
Train: 44 [ 150/390]  Loss: 0.1885 (0.225)  Acc@1: 93.7500 (92.2703)  Acc@5: 100.0000 (99.8448)LR: 1.843e-03
Train: 44 [ 200/390]  Loss: 0.1932 (0.231)  Acc@1: 93.7500 (92.0631)  Acc@5: 100.0000 (99.8212)LR: 1.843e-03
Train: 44 [ 250/390]  Loss: 0.2589 (0.233)  Acc@1: 92.1875 (91.9696)  Acc@5: 100.0000 (99.8319)LR: 1.843e-03
Train: 44 [ 300/390]  Loss: 0.5690 (0.238)  Acc@1: 82.8125 (91.8241)  Acc@5: 96.8750 (99.8079)LR: 1.843e-03
Train: 44 [ 350/390]  Loss: 0.2804 (0.238)  Acc@1: 89.0625 (91.8403)  Acc@5: 100.0000 (99.8130)LR: 1.843e-03
Train: 44 [ 390/390]  Loss: 0.1573 (0.238)  Acc@1: 95.0000 (91.7760)  Acc@5: 100.0000 (99.8200)LR: 1.843e-03
train_acc 91.776000
Valid: 44 [   0/390]  Loss: 0.1774 (0.177)  Acc@1: 93.7500 (93.7500)  Acc@5: 100.0000 (100.0000)
Valid: 44 [  50/390]  Loss: 0.4328 (0.374)  Acc@1: 89.0625 (87.2243)  Acc@5: 95.3125 (99.2953)
Valid: 44 [ 100/390]  Loss: 0.3833 (0.362)  Acc@1: 87.5000 (87.7785)  Acc@5: 100.0000 (99.4895)
Valid: 44 [ 150/390]  Loss: 0.3638 (0.361)  Acc@1: 84.3750 (87.9863)  Acc@5: 100.0000 (99.4930)
Valid: 44 [ 200/390]  Loss: 0.4499 (0.371)  Acc@1: 82.8125 (87.9042)  Acc@5: 100.0000 (99.4325)
Valid: 44 [ 250/390]  Loss: 0.5483 (0.370)  Acc@1: 85.9375 (87.9046)  Acc@5: 98.4375 (99.4460)
Valid: 44 [ 300/390]  Loss: 0.2744 (0.376)  Acc@1: 90.6250 (87.7180)  Acc@5: 100.0000 (99.4498)
Valid: 44 [ 350/390]  Loss: 0.3834 (0.374)  Acc@1: 84.3750 (87.7804)  Acc@5: 96.8750 (99.4525)
Valid: 44 [ 390/390]  Loss: 0.1576 (0.373)  Acc@1: 95.0000 (87.8440)  Acc@5: 100.0000 (99.4600)
valid_acc 87.844000
epoch = 44   
 genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 0), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('dil_conv_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('sep_conv_3x3', 4), ('sep_conv_3x3', 3)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1759, 0.0524, 0.0471, 0.0999, 0.2231, 0.1287, 0.1428, 0.1301],
        [0.2010, 0.0427, 0.0378, 0.0627, 0.2231, 0.1872, 0.1162, 0.1293],
        [0.2536, 0.0600, 0.0518, 0.1111, 0.1819, 0.1301, 0.0891, 0.1223],
        [0.2504, 0.0533, 0.0498, 0.0906, 0.1615, 0.1505, 0.1230, 0.1209],
        [0.2929, 0.0385, 0.0368, 0.0759, 0.1335, 0.1408, 0.1465, 0.1351],
        [0.3442, 0.0607, 0.0547, 0.1249, 0.0947, 0.0994, 0.1159, 0.1053],
        [0.4193, 0.0474, 0.0446, 0.0806, 0.1355, 0.0922, 0.0880, 0.0924],
        [0.4722, 0.0358, 0.0350, 0.0741, 0.1003, 0.0985, 0.0877, 0.0964],
        [0.6102, 0.0274, 0.0279, 0.0436, 0.0630, 0.0958, 0.0658, 0.0663],
        [0.4218, 0.0499, 0.0480, 0.0973, 0.0846, 0.0997, 0.0972, 0.1015],
        [0.5799, 0.0370, 0.0353, 0.0572, 0.0890, 0.0697, 0.0664, 0.0655],
        [0.5073, 0.0317, 0.0319, 0.0613, 0.0956, 0.0781, 0.0987, 0.0952],
        [0.6264, 0.0250, 0.0261, 0.0398, 0.0650, 0.0632, 0.0781, 0.0762],
        [0.6185, 0.0208, 0.0223, 0.0294, 0.0634, 0.0794, 0.0674, 0.0988]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1092, 0.1415, 0.1076, 0.1097, 0.1423, 0.1557, 0.1186, 0.1155],
        [0.1217, 0.0974, 0.0825, 0.1373, 0.1641, 0.1408, 0.1390, 0.1172],
        [0.1077, 0.1279, 0.1031, 0.1262, 0.1426, 0.1139, 0.1573, 0.1214],
        [0.1226, 0.1045, 0.0958, 0.1301, 0.1559, 0.1212, 0.1429, 0.1271],
        [0.1210, 0.0759, 0.0797, 0.1251, 0.1649, 0.1474, 0.1388, 0.1473],
        [0.1129, 0.1390, 0.1197, 0.1172, 0.1347, 0.1191, 0.1413, 0.1160],
        [0.1393, 0.1201, 0.1147, 0.1142, 0.1401, 0.1284, 0.1292, 0.1139],
        [0.1324, 0.0757, 0.0880, 0.1572, 0.1584, 0.1351, 0.1292, 0.1241],
        [0.1421, 0.0704, 0.0801, 0.1386, 0.1344, 0.1530, 0.1364, 0.1450],
        [0.1166, 0.1212, 0.1035, 0.1233, 0.1422, 0.1629, 0.1016, 0.1288],
        [0.1190, 0.1057, 0.1018, 0.1212, 0.1339, 0.1377, 0.1379, 0.1428],
        [0.1296, 0.0711, 0.0751, 0.1276, 0.1451, 0.1629, 0.1223, 0.1664],
        [0.1354, 0.0645, 0.0713, 0.1174, 0.1671, 0.1567, 0.1347, 0.1529],
        [0.1629, 0.0587, 0.0649, 0.0890, 0.1711, 0.1454, 0.1450, 0.1631]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 45 [   0/390]  Loss: 0.1988 (0.199)  Acc@1: 92.1875 (92.1875)  Acc@5: 100.0000 (100.0000)LR: 1.587e-03
Train: 45 [  50/390]  Loss: 0.3019 (0.227)  Acc@1: 90.6250 (92.0037)  Acc@5: 100.0000 (99.8468)LR: 1.587e-03
Train: 45 [ 100/390]  Loss: 0.1867 (0.223)  Acc@1: 90.6250 (92.2030)  Acc@5: 100.0000 (99.8453)LR: 1.587e-03
Train: 45 [ 150/390]  Loss: 0.1801 (0.232)  Acc@1: 87.5000 (91.9081)  Acc@5: 100.0000 (99.8551)LR: 1.587e-03
Train: 45 [ 200/390]  Loss: 0.3088 (0.235)  Acc@1: 89.0625 (91.7988)  Acc@5: 100.0000 (99.8368)LR: 1.587e-03
Train: 45 [ 250/390]  Loss: 0.1834 (0.239)  Acc@1: 93.7500 (91.7206)  Acc@5: 100.0000 (99.8444)LR: 1.587e-03
Train: 45 [ 300/390]  Loss: 0.2363 (0.244)  Acc@1: 89.0625 (91.5594)  Acc@5: 100.0000 (99.8131)LR: 1.587e-03
Train: 45 [ 350/390]  Loss: 0.2588 (0.244)  Acc@1: 89.0625 (91.5687)  Acc@5: 100.0000 (99.8175)LR: 1.587e-03
Train: 45 [ 390/390]  Loss: 0.3167 (0.243)  Acc@1: 85.0000 (91.5560)  Acc@5: 100.0000 (99.8280)LR: 1.587e-03
train_acc 91.556000
Valid: 45 [   0/390]  Loss: 0.4421 (0.442)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)
Valid: 45 [  50/390]  Loss: 0.4346 (0.414)  Acc@1: 85.9375 (86.7034)  Acc@5: 100.0000 (99.2647)
Valid: 45 [ 100/390]  Loss: 0.2955 (0.414)  Acc@1: 89.0625 (86.4016)  Acc@5: 100.0000 (99.3038)
Valid: 45 [ 150/390]  Loss: 0.2818 (0.417)  Acc@1: 87.5000 (86.5066)  Acc@5: 100.0000 (99.2446)
Valid: 45 [ 200/390]  Loss: 0.4101 (0.418)  Acc@1: 92.1875 (86.4894)  Acc@5: 98.4375 (99.2537)
Valid: 45 [ 250/390]  Loss: 0.6191 (0.414)  Acc@1: 85.9375 (86.7343)  Acc@5: 100.0000 (99.2343)
Valid: 45 [ 300/390]  Loss: 0.5474 (0.414)  Acc@1: 82.8125 (86.6435)  Acc@5: 96.8750 (99.1798)
Valid: 45 [ 350/390]  Loss: 0.3705 (0.409)  Acc@1: 84.3750 (86.7254)  Acc@5: 100.0000 (99.2343)
Valid: 45 [ 390/390]  Loss: 0.4245 (0.411)  Acc@1: 90.0000 (86.6480)  Acc@5: 97.5000 (99.2240)
valid_acc 86.648000
epoch = 45   
 genotype = Genotype(normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('dil_conv_5x5', 0), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('dil_conv_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('sep_conv_3x3', 4), ('dil_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1783, 0.0511, 0.0463, 0.0989, 0.2257, 0.1269, 0.1427, 0.1301],
        [0.2025, 0.0423, 0.0376, 0.0623, 0.2254, 0.1869, 0.1145, 0.1285],
        [0.2589, 0.0585, 0.0509, 0.1096, 0.1816, 0.1307, 0.0878, 0.1220],
        [0.2537, 0.0529, 0.0497, 0.0905, 0.1618, 0.1494, 0.1226, 0.1194],
        [0.2961, 0.0379, 0.0363, 0.0747, 0.1332, 0.1400, 0.1470, 0.1348],
        [0.3523, 0.0593, 0.0539, 0.1246, 0.0927, 0.0985, 0.1148, 0.1038],
        [0.4308, 0.0470, 0.0442, 0.0800, 0.1329, 0.0896, 0.0855, 0.0902],
        [0.4828, 0.0351, 0.0345, 0.0727, 0.0986, 0.0967, 0.0854, 0.0941],
        [0.6200, 0.0269, 0.0274, 0.0427, 0.0611, 0.0943, 0.0635, 0.0641],
        [0.4311, 0.0487, 0.0473, 0.0967, 0.0824, 0.0991, 0.0947, 0.1000],
        [0.5913, 0.0365, 0.0349, 0.0563, 0.0855, 0.0679, 0.0641, 0.0636],
        [0.5175, 0.0314, 0.0315, 0.0606, 0.0940, 0.0762, 0.0959, 0.0929],
        [0.6360, 0.0249, 0.0259, 0.0395, 0.0632, 0.0615, 0.0751, 0.0739],
        [0.6287, 0.0206, 0.0221, 0.0293, 0.0613, 0.0773, 0.0648, 0.0959]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1089, 0.1411, 0.1078, 0.1095, 0.1423, 0.1567, 0.1190, 0.1148],
        [0.1219, 0.0965, 0.0816, 0.1369, 0.1653, 0.1416, 0.1387, 0.1174],
        [0.1079, 0.1268, 0.1032, 0.1272, 0.1405, 0.1139, 0.1594, 0.1213],
        [0.1224, 0.1036, 0.0951, 0.1307, 0.1558, 0.1215, 0.1427, 0.1282],
        [0.1206, 0.0747, 0.0795, 0.1249, 0.1669, 0.1482, 0.1382, 0.1470],
        [0.1127, 0.1376, 0.1194, 0.1163, 0.1351, 0.1191, 0.1429, 0.1168],
        [0.1394, 0.1199, 0.1145, 0.1135, 0.1406, 0.1290, 0.1293, 0.1139],
        [0.1319, 0.0748, 0.0879, 0.1571, 0.1601, 0.1343, 0.1295, 0.1244],
        [0.1421, 0.0697, 0.0796, 0.1370, 0.1338, 0.1530, 0.1385, 0.1462],
        [0.1171, 0.1196, 0.1029, 0.1236, 0.1432, 0.1629, 0.1017, 0.1291],
        [0.1188, 0.1047, 0.1011, 0.1203, 0.1342, 0.1385, 0.1392, 0.1432],
        [0.1296, 0.0698, 0.0745, 0.1266, 0.1450, 0.1648, 0.1215, 0.1682],
        [0.1368, 0.0641, 0.0712, 0.1166, 0.1675, 0.1568, 0.1337, 0.1533],
        [0.1620, 0.0581, 0.0645, 0.0881, 0.1725, 0.1461, 0.1450, 0.1638]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 46 [   0/390]  Loss: 0.5186 (0.519)  Acc@1: 79.6875 (79.6875)  Acc@5: 100.0000 (100.0000)LR: 1.377e-03
Train: 46 [  50/390]  Loss: 0.1318 (0.244)  Acc@1: 96.8750 (91.5441)  Acc@5: 100.0000 (99.9694)LR: 1.377e-03
Train: 46 [ 100/390]  Loss: 0.1717 (0.238)  Acc@1: 92.1875 (91.7234)  Acc@5: 100.0000 (99.9381)LR: 1.377e-03
Train: 46 [ 150/390]  Loss: 0.1396 (0.228)  Acc@1: 96.8750 (92.0840)  Acc@5: 100.0000 (99.9172)LR: 1.377e-03
Train: 46 [ 200/390]  Loss: 0.2012 (0.229)  Acc@1: 93.7500 (92.0165)  Acc@5: 100.0000 (99.8989)LR: 1.377e-03
Train: 46 [ 250/390]  Loss: 0.2355 (0.228)  Acc@1: 92.1875 (92.0319)  Acc@5: 100.0000 (99.9128)LR: 1.377e-03
Train: 46 [ 300/390]  Loss: 0.1334 (0.231)  Acc@1: 96.8750 (91.8968)  Acc@5: 100.0000 (99.8858)LR: 1.377e-03
Train: 46 [ 350/390]  Loss: 0.2012 (0.232)  Acc@1: 93.7500 (91.8625)  Acc@5: 100.0000 (99.8665)LR: 1.377e-03
Train: 46 [ 390/390]  Loss: 0.4238 (0.234)  Acc@1: 82.5000 (91.7880)  Acc@5: 100.0000 (99.8560)LR: 1.377e-03
train_acc 91.788000
Valid: 46 [   0/390]  Loss: 0.3844 (0.384)  Acc@1: 90.6250 (90.6250)  Acc@5: 98.4375 (98.4375)
Valid: 46 [  50/390]  Loss: 0.6621 (0.342)  Acc@1: 82.8125 (89.0319)  Acc@5: 98.4375 (99.3873)
Valid: 46 [ 100/390]  Loss: 0.2459 (0.355)  Acc@1: 92.1875 (88.2890)  Acc@5: 100.0000 (99.3967)
Valid: 46 [ 150/390]  Loss: 0.3527 (0.352)  Acc@1: 89.0625 (88.5762)  Acc@5: 100.0000 (99.4619)
Valid: 46 [ 200/390]  Loss: 0.6045 (0.352)  Acc@1: 82.8125 (88.3162)  Acc@5: 96.8750 (99.4947)
Valid: 46 [ 250/390]  Loss: 0.4373 (0.345)  Acc@1: 81.2500 (88.5022)  Acc@5: 100.0000 (99.5207)
Valid: 46 [ 300/390]  Loss: 0.3294 (0.350)  Acc@1: 90.6250 (88.3980)  Acc@5: 100.0000 (99.5069)
Valid: 46 [ 350/390]  Loss: 0.3167 (0.354)  Acc@1: 89.0625 (88.3146)  Acc@5: 100.0000 (99.4970)
Valid: 46 [ 390/390]  Loss: 0.5398 (0.353)  Acc@1: 85.0000 (88.3640)  Acc@5: 100.0000 (99.4960)
valid_acc 88.364000
epoch = 46   
 genotype = Genotype(normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('sep_conv_5x5', 0), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('dil_conv_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('sep_conv_3x3', 4), ('dil_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1791, 0.0503, 0.0459, 0.0981, 0.2289, 0.1262, 0.1423, 0.1292],
        [0.2051, 0.0417, 0.0374, 0.0617, 0.2257, 0.1868, 0.1137, 0.1279],
        [0.2632, 0.0574, 0.0506, 0.1093, 0.1811, 0.1312, 0.0872, 0.1200],
        [0.2582, 0.0522, 0.0494, 0.0900, 0.1627, 0.1475, 0.1221, 0.1179],
        [0.3025, 0.0373, 0.0360, 0.0739, 0.1321, 0.1381, 0.1460, 0.1340],
        [0.3565, 0.0582, 0.0535, 0.1243, 0.0912, 0.0980, 0.1150, 0.1034],
        [0.4413, 0.0459, 0.0434, 0.0786, 0.1301, 0.0880, 0.0840, 0.0888],
        [0.4915, 0.0346, 0.0342, 0.0716, 0.0974, 0.0948, 0.0835, 0.0923],
        [0.6304, 0.0264, 0.0269, 0.0415, 0.0595, 0.0919, 0.0614, 0.0619],
        [0.4363, 0.0480, 0.0469, 0.0960, 0.0817, 0.0989, 0.0941, 0.0981],
        [0.6003, 0.0357, 0.0344, 0.0553, 0.0835, 0.0662, 0.0626, 0.0620],
        [0.5250, 0.0310, 0.0314, 0.0597, 0.0921, 0.0753, 0.0946, 0.0910],
        [0.6443, 0.0246, 0.0257, 0.0389, 0.0616, 0.0596, 0.0728, 0.0724],
        [0.6356, 0.0206, 0.0221, 0.0295, 0.0595, 0.0758, 0.0634, 0.0935]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1083, 0.1405, 0.1077, 0.1093, 0.1421, 0.1586, 0.1193, 0.1141],
        [0.1224, 0.0957, 0.0807, 0.1372, 0.1655, 0.1438, 0.1376, 0.1172],
        [0.1078, 0.1264, 0.1034, 0.1266, 0.1404, 0.1143, 0.1612, 0.1200],
        [0.1224, 0.1023, 0.0937, 0.1302, 0.1572, 0.1226, 0.1430, 0.1285],
        [0.1206, 0.0736, 0.0796, 0.1241, 0.1675, 0.1491, 0.1378, 0.1476],
        [0.1122, 0.1371, 0.1196, 0.1162, 0.1366, 0.1193, 0.1428, 0.1162],
        [0.1406, 0.1189, 0.1135, 0.1135, 0.1420, 0.1284, 0.1298, 0.1133],
        [0.1317, 0.0742, 0.0882, 0.1573, 0.1609, 0.1326, 0.1301, 0.1251],
        [0.1408, 0.0693, 0.0795, 0.1360, 0.1338, 0.1544, 0.1395, 0.1467],
        [0.1172, 0.1180, 0.1020, 0.1242, 0.1441, 0.1627, 0.1020, 0.1298],
        [0.1189, 0.1035, 0.1001, 0.1197, 0.1347, 0.1404, 0.1386, 0.1441],
        [0.1299, 0.0688, 0.0743, 0.1254, 0.1458, 0.1643, 0.1216, 0.1699],
        [0.1373, 0.0636, 0.0711, 0.1153, 0.1692, 0.1570, 0.1328, 0.1536],
        [0.1625, 0.0576, 0.0644, 0.0868, 0.1732, 0.1469, 0.1439, 0.1646]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 47 [   0/390]  Loss: 0.2807 (0.281)  Acc@1: 92.1875 (92.1875)  Acc@5: 100.0000 (100.0000)LR: 1.213e-03
Train: 47 [  50/390]  Loss: 0.3944 (0.242)  Acc@1: 82.8125 (91.2684)  Acc@5: 100.0000 (99.8775)LR: 1.213e-03
Train: 47 [ 100/390]  Loss: 0.2946 (0.239)  Acc@1: 89.0625 (91.5532)  Acc@5: 98.4375 (99.8453)LR: 1.213e-03
Train: 47 [ 150/390]  Loss: 0.08589 (0.229)  Acc@1: 96.8750 (91.8667)  Acc@5: 100.0000 (99.8551)LR: 1.213e-03
Train: 47 [ 200/390]  Loss: 0.1644 (0.228)  Acc@1: 92.1875 (91.9154)  Acc@5: 100.0000 (99.8445)LR: 1.213e-03
Train: 47 [ 250/390]  Loss: 0.2218 (0.232)  Acc@1: 92.1875 (91.8202)  Acc@5: 100.0000 (99.8444)LR: 1.213e-03
Train: 47 [ 300/390]  Loss: 0.3648 (0.236)  Acc@1: 90.6250 (91.7255)  Acc@5: 100.0000 (99.8235)LR: 1.213e-03
Train: 47 [ 350/390]  Loss: 0.2361 (0.238)  Acc@1: 93.7500 (91.6667)  Acc@5: 100.0000 (99.8397)LR: 1.213e-03
Train: 47 [ 390/390]  Loss: 0.3158 (0.237)  Acc@1: 85.0000 (91.6680)  Acc@5: 100.0000 (99.8520)LR: 1.213e-03
train_acc 91.668000
Valid: 47 [   0/390]  Loss: 0.4638 (0.464)  Acc@1: 81.2500 (81.2500)  Acc@5: 100.0000 (100.0000)
Valid: 47 [  50/390]  Loss: 0.4929 (0.374)  Acc@1: 87.5000 (87.4387)  Acc@5: 96.8750 (99.3566)
Valid: 47 [ 100/390]  Loss: 0.3297 (0.399)  Acc@1: 90.6250 (86.6955)  Acc@5: 100.0000 (99.4740)
Valid: 47 [ 150/390]  Loss: 0.3426 (0.410)  Acc@1: 87.5000 (86.3514)  Acc@5: 100.0000 (99.4309)
Valid: 47 [ 200/390]  Loss: 0.3017 (0.407)  Acc@1: 90.6250 (86.5905)  Acc@5: 100.0000 (99.4403)
Valid: 47 [ 250/390]  Loss: 0.3094 (0.406)  Acc@1: 90.6250 (86.4978)  Acc@5: 100.0000 (99.4335)
Valid: 47 [ 300/390]  Loss: 0.4926 (0.412)  Acc@1: 81.2500 (86.3164)  Acc@5: 98.4375 (99.4238)
Valid: 47 [ 350/390]  Loss: 0.4015 (0.407)  Acc@1: 92.1875 (86.4628)  Acc@5: 98.4375 (99.4347)
Valid: 47 [ 390/390]  Loss: 0.3819 (0.409)  Acc@1: 92.5000 (86.3640)  Acc@5: 97.5000 (99.4160)
valid_acc 86.364000
epoch = 47   
 genotype = Genotype(normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('sep_conv_5x5', 0), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('dil_conv_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('sep_conv_3x3', 4), ('dil_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1799, 0.0492, 0.0455, 0.0977, 0.2312, 0.1261, 0.1414, 0.1291],
        [0.2080, 0.0410, 0.0372, 0.0613, 0.2249, 0.1873, 0.1134, 0.1267],
        [0.2687, 0.0559, 0.0497, 0.1076, 0.1806, 0.1320, 0.0862, 0.1192],
        [0.2632, 0.0512, 0.0487, 0.0885, 0.1624, 0.1459, 0.1220, 0.1181],
        [0.3092, 0.0367, 0.0355, 0.0728, 0.1310, 0.1371, 0.1448, 0.1329],
        [0.3641, 0.0571, 0.0527, 0.1237, 0.0898, 0.0964, 0.1133, 0.1030],
        [0.4533, 0.0448, 0.0426, 0.0770, 0.1275, 0.0862, 0.0818, 0.0868],
        [0.4997, 0.0342, 0.0340, 0.0706, 0.0959, 0.0936, 0.0818, 0.0902],
        [0.6387, 0.0261, 0.0265, 0.0408, 0.0584, 0.0896, 0.0599, 0.0600],
        [0.4446, 0.0470, 0.0462, 0.0948, 0.0805, 0.0979, 0.0924, 0.0965],
        [0.6120, 0.0349, 0.0338, 0.0540, 0.0803, 0.0643, 0.0608, 0.0600],
        [0.5327, 0.0306, 0.0312, 0.0588, 0.0904, 0.0738, 0.0933, 0.0892],
        [0.6556, 0.0243, 0.0252, 0.0380, 0.0594, 0.0575, 0.0704, 0.0696],
        [0.6449, 0.0204, 0.0220, 0.0295, 0.0575, 0.0739, 0.0608, 0.0911]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1085, 0.1397, 0.1070, 0.1098, 0.1425, 0.1591, 0.1196, 0.1138],
        [0.1221, 0.0950, 0.0802, 0.1370, 0.1674, 0.1451, 0.1367, 0.1166],
        [0.1081, 0.1257, 0.1034, 0.1251, 0.1407, 0.1141, 0.1635, 0.1194],
        [0.1225, 0.1013, 0.0932, 0.1304, 0.1570, 0.1237, 0.1428, 0.1289],
        [0.1196, 0.0731, 0.0792, 0.1228, 0.1689, 0.1490, 0.1378, 0.1495],
        [0.1123, 0.1361, 0.1187, 0.1167, 0.1358, 0.1203, 0.1431, 0.1169],
        [0.1405, 0.1182, 0.1132, 0.1131, 0.1423, 0.1289, 0.1298, 0.1140],
        [0.1302, 0.0737, 0.0879, 0.1559, 0.1622, 0.1340, 0.1311, 0.1251],
        [0.1403, 0.0689, 0.0796, 0.1355, 0.1334, 0.1552, 0.1400, 0.1471],
        [0.1184, 0.1163, 0.1009, 0.1244, 0.1449, 0.1627, 0.1018, 0.1306],
        [0.1187, 0.1028, 0.1000, 0.1201, 0.1343, 0.1414, 0.1394, 0.1432],
        [0.1307, 0.0683, 0.0741, 0.1247, 0.1455, 0.1640, 0.1216, 0.1711],
        [0.1373, 0.0631, 0.0708, 0.1146, 0.1704, 0.1575, 0.1328, 0.1536],
        [0.1625, 0.0574, 0.0644, 0.0865, 0.1744, 0.1476, 0.1426, 0.1646]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 48 [   0/390]  Loss: 0.3504 (0.350)  Acc@1: 85.9375 (85.9375)  Acc@5: 100.0000 (100.0000)LR: 1.095e-03
Train: 48 [  50/390]  Loss: 0.2563 (0.258)  Acc@1: 95.3125 (90.4718)  Acc@5: 100.0000 (99.8775)LR: 1.095e-03
Train: 48 [ 100/390]  Loss: 0.3867 (0.249)  Acc@1: 82.8125 (90.9963)  Acc@5: 98.4375 (99.7525)LR: 1.095e-03
Train: 48 [ 150/390]  Loss: 0.2227 (0.239)  Acc@1: 96.8750 (91.4011)  Acc@5: 100.0000 (99.8034)LR: 1.095e-03
Train: 48 [ 200/390]  Loss: 0.1308 (0.237)  Acc@1: 95.3125 (91.6045)  Acc@5: 100.0000 (99.8290)LR: 1.095e-03
Train: 48 [ 250/390]  Loss: 0.1197 (0.239)  Acc@1: 96.8750 (91.5090)  Acc@5: 100.0000 (99.8195)LR: 1.095e-03
Train: 48 [ 300/390]  Loss: 0.2374 (0.239)  Acc@1: 90.6250 (91.5646)  Acc@5: 100.0000 (99.8079)LR: 1.095e-03
Train: 48 [ 350/390]  Loss: 0.2925 (0.239)  Acc@1: 85.9375 (91.5687)  Acc@5: 100.0000 (99.8041)LR: 1.095e-03
Train: 48 [ 390/390]  Loss: 0.4990 (0.240)  Acc@1: 82.5000 (91.5360)  Acc@5: 100.0000 (99.8080)LR: 1.095e-03
train_acc 91.536000
Valid: 48 [   0/390]  Loss: 0.3143 (0.314)  Acc@1: 89.0625 (89.0625)  Acc@5: 98.4375 (98.4375)
Valid: 48 [  50/390]  Loss: 0.5081 (0.386)  Acc@1: 82.8125 (86.3971)  Acc@5: 98.4375 (99.4792)
Valid: 48 [ 100/390]  Loss: 0.3975 (0.418)  Acc@1: 84.3750 (85.4270)  Acc@5: 100.0000 (99.3657)
Valid: 48 [ 150/390]  Loss: 0.5196 (0.407)  Acc@1: 84.3750 (86.0824)  Acc@5: 100.0000 (99.2757)
Valid: 48 [ 200/390]  Loss: 0.3729 (0.409)  Acc@1: 89.0625 (86.3417)  Acc@5: 98.4375 (99.2848)
Valid: 48 [ 250/390]  Loss: 0.4543 (0.411)  Acc@1: 84.3750 (86.3608)  Acc@5: 98.4375 (99.2966)
Valid: 48 [ 300/390]  Loss: 0.3763 (0.415)  Acc@1: 85.9375 (86.3164)  Acc@5: 100.0000 (99.3096)
Valid: 48 [ 350/390]  Loss: 0.2618 (0.409)  Acc@1: 87.5000 (86.4717)  Acc@5: 100.0000 (99.3056)
Valid: 48 [ 390/390]  Loss: 0.3798 (0.409)  Acc@1: 87.5000 (86.3520)  Acc@5: 97.5000 (99.3000)
valid_acc 86.352000
epoch = 48   
 genotype = Genotype(normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('sep_conv_5x5', 0), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('dil_conv_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('sep_conv_3x3', 4), ('dil_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1795, 0.0480, 0.0450, 0.0966, 0.2335, 0.1265, 0.1414, 0.1294],
        [0.2127, 0.0403, 0.0370, 0.0607, 0.2226, 0.1891, 0.1130, 0.1247],
        [0.2736, 0.0546, 0.0490, 0.1062, 0.1790, 0.1329, 0.0862, 0.1186],
        [0.2687, 0.0501, 0.0479, 0.0870, 0.1626, 0.1452, 0.1219, 0.1167],
        [0.3141, 0.0365, 0.0355, 0.0723, 0.1298, 0.1356, 0.1438, 0.1324],
        [0.3723, 0.0559, 0.0519, 0.1224, 0.0884, 0.0949, 0.1120, 0.1021],
        [0.4659, 0.0435, 0.0415, 0.0748, 0.1260, 0.0841, 0.0799, 0.0845],
        [0.5085, 0.0338, 0.0336, 0.0692, 0.0937, 0.0925, 0.0801, 0.0887],
        [0.6458, 0.0256, 0.0261, 0.0401, 0.0573, 0.0882, 0.0583, 0.0585],
        [0.4537, 0.0458, 0.0453, 0.0931, 0.0793, 0.0975, 0.0908, 0.0945],
        [0.6262, 0.0337, 0.0328, 0.0521, 0.0769, 0.0619, 0.0585, 0.0579],
        [0.5427, 0.0302, 0.0309, 0.0575, 0.0882, 0.0720, 0.0912, 0.0873],
        [0.6658, 0.0237, 0.0248, 0.0373, 0.0580, 0.0554, 0.0679, 0.0671],
        [0.6544, 0.0201, 0.0217, 0.0292, 0.0557, 0.0718, 0.0583, 0.0888]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1092, 0.1377, 0.1058, 0.1103, 0.1425, 0.1606, 0.1205, 0.1135],
        [0.1215, 0.0943, 0.0798, 0.1374, 0.1688, 0.1453, 0.1366, 0.1163],
        [0.1083, 0.1241, 0.1028, 0.1249, 0.1395, 0.1154, 0.1655, 0.1195],
        [0.1224, 0.0999, 0.0922, 0.1301, 0.1584, 0.1249, 0.1428, 0.1294],
        [0.1193, 0.0724, 0.0788, 0.1216, 0.1710, 0.1499, 0.1367, 0.1504],
        [0.1126, 0.1347, 0.1181, 0.1163, 0.1367, 0.1211, 0.1435, 0.1169],
        [0.1400, 0.1176, 0.1127, 0.1135, 0.1438, 0.1287, 0.1299, 0.1138],
        [0.1309, 0.0734, 0.0879, 0.1561, 0.1609, 0.1350, 0.1302, 0.1256],
        [0.1395, 0.0689, 0.0799, 0.1354, 0.1334, 0.1554, 0.1400, 0.1475],
        [0.1189, 0.1143, 0.0997, 0.1248, 0.1459, 0.1653, 0.1009, 0.1302],
        [0.1181, 0.1022, 0.0999, 0.1201, 0.1336, 0.1412, 0.1409, 0.1439],
        [0.1310, 0.0678, 0.0737, 0.1238, 0.1458, 0.1641, 0.1217, 0.1721],
        [0.1366, 0.0625, 0.0706, 0.1139, 0.1716, 0.1596, 0.1324, 0.1528],
        [0.1621, 0.0568, 0.0640, 0.0858, 0.1756, 0.1481, 0.1415, 0.1662]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
Train: 49 [   0/390]  Loss: 0.1623 (0.162)  Acc@1: 95.3125 (95.3125)  Acc@5: 100.0000 (100.0000)LR: 1.024e-03
Train: 49 [  50/390]  Loss: 0.1252 (0.238)  Acc@1: 93.7500 (91.6973)  Acc@5: 100.0000 (99.8775)LR: 1.024e-03
Train: 49 [ 100/390]  Loss: 0.2588 (0.248)  Acc@1: 90.6250 (91.2438)  Acc@5: 100.0000 (99.8453)LR: 1.024e-03
Train: 49 [ 150/390]  Loss: 0.3297 (0.251)  Acc@1: 87.5000 (91.1631)  Acc@5: 100.0000 (99.8448)LR: 1.024e-03
Train: 49 [ 200/390]  Loss: 0.2682 (0.250)  Acc@1: 85.9375 (91.2313)  Acc@5: 98.4375 (99.8368)LR: 1.024e-03
Train: 49 [ 250/390]  Loss: 0.2767 (0.249)  Acc@1: 89.0625 (91.2724)  Acc@5: 100.0000 (99.8444)LR: 1.024e-03
Train: 49 [ 300/390]  Loss: 0.1541 (0.249)  Acc@1: 92.1875 (91.2635)  Acc@5: 100.0000 (99.8339)LR: 1.024e-03
Train: 49 [ 350/390]  Loss: 0.3243 (0.249)  Acc@1: 89.0625 (91.2927)  Acc@5: 98.4375 (99.8353)LR: 1.024e-03
Train: 49 [ 390/390]  Loss: 0.2095 (0.246)  Acc@1: 90.0000 (91.3680)  Acc@5: 100.0000 (99.8360)LR: 1.024e-03
train_acc 91.368000
Valid: 49 [   0/390]  Loss: 0.5602 (0.560)  Acc@1: 79.6875 (79.6875)  Acc@5: 100.0000 (100.0000)
Valid: 49 [  50/390]  Loss: 0.3035 (0.400)  Acc@1: 87.5000 (86.2745)  Acc@5: 100.0000 (99.3873)
Valid: 49 [ 100/390]  Loss: 0.4761 (0.417)  Acc@1: 84.3750 (85.8447)  Acc@5: 100.0000 (99.3193)
Valid: 49 [ 150/390]  Loss: 0.4086 (0.418)  Acc@1: 87.5000 (85.9168)  Acc@5: 98.4375 (99.3584)
Valid: 49 [ 200/390]  Loss: 0.5392 (0.420)  Acc@1: 84.3750 (85.8753)  Acc@5: 100.0000 (99.3004)
Valid: 49 [ 250/390]  Loss: 0.3744 (0.422)  Acc@1: 85.9375 (85.7694)  Acc@5: 100.0000 (99.2903)
Valid: 49 [ 300/390]  Loss: 0.4339 (0.423)  Acc@1: 87.5000 (85.9012)  Acc@5: 98.4375 (99.2577)
Valid: 49 [ 350/390]  Loss: 0.3152 (0.425)  Acc@1: 87.5000 (85.8351)  Acc@5: 98.4375 (99.2967)
Valid: 49 [ 390/390]  Loss: 0.6506 (0.423)  Acc@1: 77.5000 (85.9880)  Acc@5: 100.0000 (99.2800)
valid_acc 85.988000
epoch = 49   
 genotype = Genotype(normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('sep_conv_5x5', 0), ('dil_conv_3x3', 2)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 2), ('dil_conv_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_5x5', 3), ('sep_conv_3x3', 4), ('dil_conv_5x5', 2)], reduce_concat=range(2, 6))
alphas_normal = 
 tensor([[0.1805, 0.0471, 0.0448, 0.0962, 0.2338, 0.1272, 0.1414, 0.1291],
        [0.2155, 0.0397, 0.0368, 0.0602, 0.2235, 0.1882, 0.1126, 0.1236],
        [0.2768, 0.0533, 0.0487, 0.1057, 0.1780, 0.1332, 0.0858, 0.1185],
        [0.2740, 0.0490, 0.0476, 0.0861, 0.1624, 0.1444, 0.1206, 0.1158],
        [0.3174, 0.0362, 0.0354, 0.0716, 0.1288, 0.1349, 0.1436, 0.1321],
        [0.3791, 0.0552, 0.0517, 0.1225, 0.0868, 0.0929, 0.1111, 0.1007],
        [0.4766, 0.0425, 0.0409, 0.0734, 0.1239, 0.0817, 0.0785, 0.0825],
        [0.5167, 0.0335, 0.0335, 0.0683, 0.0923, 0.0910, 0.0782, 0.0863],
        [0.6520, 0.0254, 0.0262, 0.0398, 0.0561, 0.0868, 0.0564, 0.0572],
        [0.4628, 0.0448, 0.0449, 0.0923, 0.0776, 0.0963, 0.0891, 0.0924],
        [0.6357, 0.0329, 0.0323, 0.0510, 0.0745, 0.0607, 0.0568, 0.0560],
        [0.5513, 0.0298, 0.0305, 0.0567, 0.0865, 0.0708, 0.0887, 0.0856],
        [0.6728, 0.0234, 0.0247, 0.0369, 0.0570, 0.0541, 0.0656, 0.0656],
        [0.6624, 0.0199, 0.0216, 0.0292, 0.0539, 0.0699, 0.0560, 0.0871]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
 alphas_reduct = 
 tensor([[0.1092, 0.1371, 0.1055, 0.1108, 0.1429, 0.1612, 0.1199, 0.1134],
        [0.1218, 0.0938, 0.0795, 0.1375, 0.1682, 0.1471, 0.1361, 0.1160],
        [0.1082, 0.1233, 0.1026, 0.1255, 0.1392, 0.1144, 0.1668, 0.1200],
        [0.1223, 0.0990, 0.0917, 0.1304, 0.1594, 0.1254, 0.1427, 0.1291],
        [0.1195, 0.0718, 0.0789, 0.1212, 0.1719, 0.1514, 0.1362, 0.1492],
        [0.1129, 0.1339, 0.1181, 0.1164, 0.1365, 0.1204, 0.1445, 0.1172],
        [0.1401, 0.1174, 0.1132, 0.1135, 0.1440, 0.1282, 0.1295, 0.1142],
        [0.1306, 0.0727, 0.0878, 0.1555, 0.1620, 0.1335, 0.1324, 0.1255],
        [0.1388, 0.0686, 0.0800, 0.1347, 0.1335, 0.1548, 0.1415, 0.1481],
        [0.1192, 0.1128, 0.0987, 0.1250, 0.1482, 0.1650, 0.0999, 0.1312],
        [0.1184, 0.1019, 0.1004, 0.1201, 0.1322, 0.1417, 0.1412, 0.1440],
        [0.1317, 0.0673, 0.0734, 0.1229, 0.1455, 0.1638, 0.1218, 0.1736],
        [0.1358, 0.0621, 0.0703, 0.1125, 0.1728, 0.1610, 0.1315, 0.1540],
        [0.1610, 0.0567, 0.0638, 0.0848, 0.1761, 0.1501, 0.1404, 0.1673]],
       device='cuda:0', grad_fn=<SoftmaxBackward0>)
